Preface


The title of this book indicates a dual purpose. Our first aim is to introduce 
fundamental ideas of algebraic numbers. The second is to tell one of the 
most intriguing stories in the history of mathematics—the quest for a proof 
of Fermat’s Last Theorem. We use this celebrated theorem to motivate 
a general study of the theory of algebraic numbers, from a reasonably 
concrete point of view. The range of topics that we cover is selected to allow 
students to make early progress in understanding the necessary concepts. 
‘Algebraic Number Theory’ can be read in two distinct ways. One 
is the theory of numbers viewed algebraically, the other is the study of 
algebraic numbers. Both apply here. We illustrate how basic notions from 
the theory of algebraic numbers may be used to solve problems in number 
theory. However, our main focus is to extend properties of the natural 
numbers to more general number structures: algebraic number fields, and 
their rings of algebraic integers. These structures have most of the standard 
properties that we associate with ordinary whole numbers, but some subtle 
properties concerning primes and factorization sometimes fail to generalize. 
A Diophantine equation (named after Diophantus of Alexandria, who— 
it is thought—lived around 250 and whose book Arithmetica systematized 
such concepts) is a polynomial equation, or a system of polynomial equa- 
tions, that is to be solved in integers or rational numbers. The central 
problem of this book concerns solutions of a very special Diophantine 
equation: 
$$x^n + y^n = z^n$$
where the exponent n is a positive integer. For n = 2 there are many integer 
solutions (in fact, infinitely many), which neatly relate to the theorem of 
Pythagoras. For n ≥ 3, however, there appear to be no integer solutions with 
x, y, z all non-zero. It is this assertion that became known as Fermat’s Last 
Theorem. (It is equivalent to there being no rational solutions; try to work 
out why.) 

One method of attack might be to imagine the equation $x^n + y^n = z^n$ 
as being situated in the complex numbers, and to use the complex nth root 
of unity $\zeta = e^{2\pi i/n}$ to obtain the factorization (valid for odd n) 


$$x^n + y^n = (x + y)(x + \zeta y) \cdots (x + \zeta^{n-1} y).$$


This approach entails introducing algebraic ideas, including the notion of 
factorization in the ring $\mathbb{Z}[\zeta]$ of polynomials in $\zeta$. This promising line of 
attack was pursued for a time in the 19th century, until it was discovered 
that this particular ring of algebraic numbers does not possess all of the 
properties that it ‘ought to’. In particular, factorization into ‘primes’ is 
not unique in this ring. (It fails, for instance, when n = 23, although this 
is not entirely obvious.) It took a while for this idea to be fully understood 
and for its consequences to sink in, but as it did so, the theory of algebraic 
numbers was developed and refined, leading to substantial improvements 
in our knowledge of Diophantine equations. In particular, it became pos- 
sible to prove Fermat’s Last Theorem in a whole range of special cases. 
Subsequently, geometric methods and other approaches were introduced to 
make further gains, until, at the end of the 20th century, Andrew Wiles 
finally set the last links in place to establish the proof after a three hundred 
year search. 

To gain insight into this extended story we must assume a certain 
level of algebraic background. Our choice is to start with fundamental 
ideas that are usually introduced into algebra courses, such as commuta- 
tive rings, groups and modules. These concepts smooth the way for the 
modern reader, but they were not explicitly available to the pioneers of 
the theory. The leading mathematicians in the 19th and early 20th cen- 
turies developed and used most of the basic results and techniques of linear 
algebra—for perhaps a hundred years—without ever defining an abstract 
vector space. There is no evidence that they suffered as a consequence of 
this lack of an explicit theory. This historical fact indicates that abstraction 
can be built only on an already existing body of specific concepts and rela- 
tionships. This indicates that students will profit from direct contact with 
the manipulation of examples of number-theoretic concepts, so the text is 
interspersed with such examples. The algebra that we introduce—which 
is what we consider necessary for grasping the essentials of the struggle to 
prove Fermat’s Last Theorem—is therefore not as ‘abstract’ as it might be. 
We believe that in mathematics it is important to ‘get your hands dirty’. 
This requires struggling with calculations in specific contexts, where the 
elegance of polished theory may disguise the essential nature of the math- 
ematics. For instance, factorization into primes in specific number fields 
displays the tendency of mathematical objects to take on a life of their own. 
In some situations something works, in others it does not, and the reasons 
why are often far from obvious. Without experiencing the struggle in per- 
son, it is quite impossible to understand why the pioneers in algebraic 
number theory had such difficulties. Of such frustrating yet stimulating 
stuff is the mathematical fabric woven. 

We therefore do not begin with later theories that have proved to be 
of value in a wider range of problems, such as Galois theory, valuation rings, 
Dedekind domains, and the like. Our purpose is to get students involved 
in performing calculations that will enable them to build a platform for 
understanding the theory. However, some algebraic background is neces- 
sary. We assume a working knowledge of a variety of topics from algebra, 
reviewed in detail in Chapter 1. These include commutative rings and 
fields, ideals and quotient rings, factorization of polynomials with real coef- 
ficients, field extensions, symmetric polynomials, modules, and free abelian 
groups. Apart from these concepts we assume only some elementary results 
from the theory of numbers and a superficial comprehension of multiple 
integrals. 

For organizational reasons rather than mathematical necessity, the book 
is divided into four parts. Part I develops the basic theory from an algebraic 
standpoint, introducing the ring of integers of a number field and exploring 
factorization within it. Quadratic and cyclotomic fields are investigated 
in more detail, and the Euclidean imaginary fields are classified. We then 
consider the notion of factorization and see how the notion of a ‘prime’ 
p can be pulled apart into two distinct ideas. The first is the concept 
of being ‘irreducible’ in the sense that p has no factors other than 1 and 
p. The second is what we now call ‘prime’: that if p is a factor of the 
product ab (possibly multiplied by units—invertible elements) then it must 
be a factor of either a or b. In this sense, a prime must be irreducible, 
but an irreducible need not be prime. It turns out that factorization into 
irreducibles is not always unique in a number field, but useful sufficient 
conditions for uniqueness may be found. The factorization theory of ideals 
in a ring of algebraic integers is more satisfactory, in that every ideal is a 
unique product of prime ideals. The extent to which factorization is not 
unique can be ‘measured’ by the group of ideal classes (fractional ideals 
modulo principal ones). 

Part II emphasizes the power of geometric methods arising from Min- 
kowski’s theorem on convex sets relative to a lattice. We prove this key 
result geometrically by looking at the torus that appears as a quotient of 
Euclidean space by the lattice concerned. As illustrations of these ideas 
we prove the two- and four-squares theorems of classical number theory; as 
the main application we prove the finiteness of the class group. 

Part III concentrates on applications of the theory thus far developed, 
beginning with some slightly ad hoc computational techniques for class 
numbers, and leading up to a special case of Fermat’s Last Theorem that 
exemplifies the development of the theory by Kummer, prior to the final 
push by Wiles. 

Part IV describes the final breakthrough, when—after a long period 
of solitary thinking—Wiles finally put together his proof of Fermat’s Last 
Theorem. Even this tale is not without incident. His first announcement 
in a lecture series in Cambridge turned out to contain a subtle unproved 
assumption, and it took another year to rectify the error. However, the 
proof is finally in a form that has been widely accepted by the mathematical 
community. In this text we cannot give the full proof in all its glory. 
Instead we discuss the new ingredients that make the proof possible: the 
ideas of elliptic curves and elliptic integrals, and the link that shows that 
the existence of a counterexample to Fermat’s Last Theorem would lead 
to a mathematical construction involving elliptic integrals. The proof of 
the theorem rests upon showing that such a construction cannot exist. We 
end with a brief survey of later developments, new conjectures, and open 
problems. 

There follow two appendices which are of importance in algebraic num- 
ber theory, but do not contribute directly to the proof of Fermat’s Last The- 
orem. The first deals with quadratic residues and the quadratic reciprocity 
theorem of Gauss. It uses straightforward computational techniques (de- 
ceptively so: the ideas are very clever). It may be read at an early stage— 
for example, right at the beginning, or alongside Chapter 3 which is rather 
short: the two together would provide a block of work comparable to the 
remaining chapters in the first part of the book. The second appendix 
proves the Dirichlet Units Theorem, again a beacon in the development of 
algebraic number theory, but not directly required in the proof of Fermat’s 
Last Theorem. 

A preliminary version of Parts I-III of the book was written in 1974 
by Ian Stewart at the University of Tübingen, under the auspices of the 
Alexander von Humboldt Foundation. This version was used as the basis of 
a course for students in Warwick in 1975; it was then revised in the light 
of that experience, and was published by Chapman and Hall. That edition 
also benefited from the subtle comments of a perceptive but anonymous 
referee; from the admirable persistence of students attending the course; 
and from discussions with colleagues. The book has been used by successive 
generations of students, and a second edition in 1986 brought the story up 
to date—at that time—and corrected typographical and computational 
errors. 

In the 1980s a proof of Fermat’s Last Theorem had not been found. 
In fact, graffiti on the wall of the Warwick Mathematics Institute declared 
‘I have a proof that Fermat’s Last Theorem is equivalent to The Four 
Colour Theorem, but this wall is too small for me to write it.’ Since that 
time, both Fermat’s Last Theorem and the Four Colour Theorem have 
fallen, after centuries of effort by the mathematical community. The final 
conquest of Fermat’s Last Theorem required a new version that would 
give a reasonable idea of the story behind the complete saga. This new 
version, brought out with a new publisher, is the result of further work 
to bring the book up to date for the 21st century. It involved substantial 
rewriting of much of the material, and two new chapters on elliptic curves 
and elliptic functions. These topics, not touched upon in previous editions, 
were required to complete the final solution of the most elusive conundrum 
in pure mathematics of the last three hundred years. 


Coventry, February 2001. Ian Stewart 
David Tall 


The Origins of 
Algebraic Number Theory 


Numbers have fascinated civilized man for millennia. The Pythagoreans 
studied many properties of the natural numbers 1, 2, 3, ..., and the famous 
theorem of Pythagoras, though geometrical, has a pronounced 
number-theoretic content. Earlier Babylonian civilizations had noted empirically 
many so-called Pythagorean triads, such as 3, 4, 5 and 5, 12, 13. These are 
natural numbers a, b, c such that 


$$a^2 + b^2 = c^2. \qquad (1)$$


A clay tablet from about 1500 B.C. includes the triple 4961, 6480, 8161, 
demonstrating the sophisticated techniques of the Babylonians. 
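
The arithmetic behind such triads is easy to confirm by machine. The following Python sketch tests equation (1) for the triads mentioned above; the function name is_pythagorean_triple is ours and is introduced only for illustration.

```python
def is_pythagorean_triple(a: int, b: int, c: int) -> bool:
    """Return True when a, b, c are natural numbers with a^2 + b^2 = c^2."""
    return a > 0 and b > 0 and c > 0 and a * a + b * b == c * c


# The two small triads and the large Babylonian one quoted in the text.
for triple in [(3, 4, 5), (5, 12, 13), (4961, 6480, 8161)]:
    print(triple, is_pythagorean_triple(*triple))
```

Exact integer arithmetic is used throughout, so the check involves no rounding.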

The Ancient Greeks, though concentrating on geometry, continued to 
take an interest in numbers. In c. 250 A.D. Diophantus of Alexandria wrote 
a significant treatise on polynomial equations which studied solutions in 
fractions. Particular cases of these equations with natural number solutions 
have been called Diophantine equations to this day. 

The study of algebra developed over the centuries, too. The Hindu 
mathematicians dealt with increasing confidence with negative numbers 
and zero. Meanwhile the Moslems conquered Alexandria in the 7th century, 
sweeping across north Africa and Spain. The ensuing civilization brought 
an enrichment of mathematics with Moslem ingenuity grafted onto Greek 
and Hindu influence. The word ‘algebra’ itself derives from the Arabic title 
‘al jabr w’al muqabalah’ (literally ‘restoration and equivalence’) of a book 
written by Al-Khowarizmi in c. 825. Peaceful coexistence of Moslem and 
Christian led to the availability of most Greek and Arabic classics in Latin 
translations by the 13th century. 

In the 16th century, Cardano used negative and imaginary solutions in 
his famous book Ars Magna, and in succeeding centuries complex numbers 
were used with greater understanding and flexibility. 

Meanwhile the theory of natural numbers was not neglected. One of the 
greatest number theorists of the 17th century was Pierre de Fermat (1601–1665). 
His fame rests on his correspondence with other mathematicians, for 
he published very little. He would set challenges in number theory based 
on his own calculations; and at his death he left a number of theorems 
whose proofs were known, if at all, only to himself. The most notorious of 
these was a marginal note in his own personal copy of Diophantus, written 
in Latin, which translates: 


To resolve a cube into [the sum of] two cubes, a fourth power into 
fourth powers, or in general any power higher than the second into 
two of the same kind, is impossible; of which fact I have found a 
remarkable proof. The margin is too small to contain it ... 


More precisely, Fermat asserted, in contrast to the case of Pythagorean 
triads, that the equation 


$$x^n + y^n = z^n \qquad (2)$$


has, for n ≥ 3, no integer solutions x, y, z (other than the trivial ones with 
one or more of x, y, z equal to zero). 

In the years following Fermat’s death, almost all of his stated results 
were furnished with a proof. An exception was his claim that $F_n = 2^{2^n} + 1$ 
is prime for all positive integers n. It was subsequently shown that he was 
wrong: for instance, $F_5$ is divisible by 641. However, this does not of itself 
show that Fermat was fallible, for he never claimed that he had a proof 
of this conjecture. One by one his other assertions were furnished with 
proofs until, by the mid-19th century, only one elusive jewel remained. His 
statement about the non-existence of solutions of (2) for n ≥ 3 exceeded the 
powers of all 19th century mathematicians. This beguiling and infuriating 
assertion, so simple to state, yet so subtle in its labyrinthine complexity, 
became known as ‘Fermat’s Last Theorem’. This romantic epithet is in fact 
doubly inappropriate for, without a proof, it was not a ‘theorem’, neither 
was it the last result that Fermat studied—only the last to remain unproved 
by other mathematicians. 
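
Fermat’s claim about the numbers 2^(2^n) + 1 can be tested directly for small n. The sketch below is illustrative only: the names fermat_number and smallest_factor are ours, and naive trial division is used simply because it happens to suffice for n up to 5. The loop starts at n = 0 for convenience.

```python
def fermat_number(n: int) -> int:
    """Return F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1


def smallest_factor(m: int) -> int:
    """Smallest divisor d > 1 of m by trial division (m itself when m is prime)."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m


for n in range(6):
    f = fermat_number(n)
    d = smallest_factor(f)
    print(n, f, "prime" if d == f else f"divisible by {d}")
```

The first five values come out prime, while the value for n = 5, namely 4294967297, is reported as divisible by 641, in agreement with the text.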

Given that a proof is so elusive, is it really credible that Fermat could 
have possessed a genuine proof—a clever way of looking at the problem 
which eluded later generations? Or had he made a subtle error, which 
passed unnoticed, so that his ‘theorem’ had no proof at all? No one knows 
for sure, but there is a strong consensus that if he did have what he thought 
was a proof, it would not survive modern scrutiny. Consensus and certainty 
are not the same thing, however. 

Be that as it may, during the late 19th and early 20th centuries the 
name stuck, with its glow of romanticism—somehow lacking in the more 
appropriate title ‘the Fermat Conjecture’. It had the two classic ingredients 
of a problem that can capture the imagination of a wider public—a simple 
statement that can be widely understood, but a proof that defeats the 
greatest intellects. 

Another classic problem of this type—‘the impossibility of trisecting 
an angle using ruler and compasses alone’—took two thousand years to 
be solved. This was posed by the Greeks in studying geometry and was 
solved in the early 19th century using algebraic techniques. In the same 
way the advancement in the solution of Fermat’s Last Theorem has moved 
out of the original domain, the theory of natural numbers, to a different 
area of mathematical study, algebraic numbers. In the 19th century the 
developing theory of algebra had matured to a state where it could usefully 
be applied in number theory. 

As it happened, Fermat’s Last Theorem was not the major problem 
being attacked by number theorists at the time; for example, when Kummer 
made the all-important breakthrough that we are to describe in this text, 
he was working on a different problem, a topic called the ‘higher reciprocity 
laws’. At this stage it is worth making a minor diversion to look at this 
subject, for it was here that algebraic numbers entered number theory in 
the work of Gauss. As an eighteen-year-old, in 1796 Gauss had given the 
first proof of a remarkable fact observed empirically by Euler in 1783. Euler 
had addressed himself to the problem of when an integer q was congruent 
to a perfect square modulo a prime p, 


$$x^2 \equiv q \pmod{p}.$$


In such a case, q is said to be a quadratic residue of p. Euler concentrated 
on the case when p, q were distinct odd primes and noted: if at least one 
of the odd primes p, q is of the form 4r + 1, then q is a quadratic residue 
of p if and only if p is a quadratic residue of q; on the other hand, if both 
p, q are of the form 4r + 3, then precisely one is a quadratic residue of the 
other. 

Because of the reciprocal nature of the relationship between p and q, 
this result was known as the quadratic reciprocity law. Legendre attempted 
a proof in 1785 but assumed that certain arithmetic progressions contained 
infinitely many primes—a theorem whose proof turned out to be far deeper 
than the quadratic reciprocity law itself. Legendre introduced the symbol 


$$(q/p) = \begin{cases} 1 & \text{if } q \text{ is a quadratic residue of } p \\ -1 & \text{if not,} \end{cases}$$

in terms of which the law becomes 


$$(q/p)(p/q) = (-1)^{(p-1)(q-1)/4}.$$
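
The law, in the form (q/p)(p/q) = (-1)^((p-1)(q-1)/4) for distinct odd primes p and q, can be checked numerically for small primes. The following Python sketch computes the Legendre symbol straight from the definition, by searching for a square root modulo p; the function names and the list of primes are our own choices, and a brute-force check of this kind is of course no substitute for a proof.

```python
def legendre(q: int, p: int) -> int:
    """Legendre symbol (q/p) for an odd prime p not dividing q, by direct search."""
    has_root = any((x * x - q) % p == 0 for x in range(1, p))
    return 1 if has_root else -1


def reciprocity_holds(p: int, q: int) -> bool:
    """Check (q/p)(p/q) == (-1)^((p-1)(q-1)/4) for distinct odd primes p, q."""
    sign = (-1) ** ((p - 1) * (q - 1) // 4)
    return legendre(q, p) * legendre(p, q) == sign


odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
print(all(reciprocity_holds(p, q)
          for p in odd_primes for q in odd_primes if p != q))
```

For example, 3 and 7 are both of the form 4r + 3, and indeed 7 is a quadratic residue of 3 while 3 is not a quadratic residue of 7.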


When Gauss gave the first proof in 1796 he was dissatisfied because 
his method did not seem a natural way to attack so seemingly simple a 
theorem. He went on to give several more proofs, two of which appeared 
in his book Disquisitiones Arithmeticae (1801), a definitive text on number 
theory which still remains in print, Gauss [29]. His second proof depends 
on a numerical criterion that he discovered, and we give a computational 
proof depending on this criterion in Appendix A. 

Between 1808 and 1832 Gauss continued to look for similar laws for 
powers higher than squares. This entailed looking for relationships between 
p and q so that q was a cubic residue of p ($x^3 \equiv q \pmod{p}$) or a biquadratic 
residue ($x^4 \equiv q \pmod{p}$), and so on. He found certain higher reciprocity 
laws, but in doing so he discovered that his calculations were made easier 
by working over the Gaussian integers $a + bi$ ($a, b \in \mathbb{Z}$, $i = \sqrt{-1}$), rather 
than the integers alone. He developed a theory of prime factorization for 
these, proved that decomposition into primes was unique, and developed 
a law of biquadratic reciprocity. In the same way he considered cubic 
reciprocity by using numbers of the form $a + b\omega$ where $\omega = e^{2\pi i/3}$. These 
higher reciprocity laws do not have the same striking simplicity as quadratic 
reciprocity and we shall not study them in this text. But Gauss’ use of 
these new types of number is of fundamental importance in Fermat’s Last 
Theorem, and the study of their factorization properties is a deep and 
fruitful source of methods and problems. 

The numbers concerned are all examples of a particular type of complex 
number; namely, one which is a solution of a polynomial equation 


$$a_n x^n + \cdots + a_1 x + a_0 = 0$$


where all the coefficients are integers. Such a complex number is said to 
be algebraic, and if $a_n = 1$, it is called an algebraic integer. Examples of 
algebraic integers include $i$ (which satisfies $x^2 + 1 = 0$), $\sqrt{2}$ ($x^2 - 2 = 0$) 
and more complicated examples, such as the roots of 
$x^7 - 265x^3 + 7x^2 - 2x + 329 = 0$. The number $\tfrac{1}{2}i$ (satisfying $4x^2 + 1 = 0$) 
is algebraic but not an algebraic integer. On the other hand, there are complex 
numbers which are not algebraic, such as e or π. 

In the wider setting of algebraic integers we can factorize a solution of 
Fermat’s equation $x^n + y^n = z^n$ (if one exists) by introducing a complex 
nth root of unity, $\zeta = e^{2\pi i/n}$, and writing (2), for odd n, as 


$$(x + y)(x + \zeta y) \cdots (x + \zeta^{n-1} y) = z^n. \qquad (3)$$

If $\mathbb{Z}[\zeta]$ denotes the set of algebraic integers of the form $a_0 + a_1\zeta + \cdots + a_r\zeta^r$ 
where each $a_i$ is an ordinary integer, then this factorization takes place in 
the ring $\mathbb{Z}[\zeta]$. 
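
The factorization of x^n + y^n into the n linear factors x + ζ^k y (k = 0, ..., n-1, with ζ a primitive complex nth root of unity and n odd) can be checked numerically. The following Python sketch uses floating-point complex arithmetic, so agreement is only up to rounding error; the function name cyclotomic_product and the sample values x = 2, y = 3 are our own.

```python
import cmath


def cyclotomic_product(x: complex, y: complex, n: int) -> complex:
    """Return (x + y)(x + zeta*y)...(x + zeta^(n-1)*y) with zeta = exp(2*pi*i/n)."""
    zeta = cmath.exp(2j * cmath.pi / n)
    product = 1 + 0j
    for k in range(n):
        product *= x + zeta ** k * y
    return product


# For odd n the product should agree with x^n + y^n up to rounding error.
for n in (3, 5, 7):
    x, y = 2, 3
    print(n, x ** n + y ** n, cyclotomic_product(x, y, n))
```

For even n the same product equals x^n - y^n instead, which is why the restriction to odd n matters.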

In 1847 the French mathematician Lamé announced a ‘proof’ of Fer- 
mat’s Last Theorem. In outline his proposal was to show that only the case 
where x, y have no common factors need be considered; and then deduce 
that in this case $x + y, x + \zeta y, \ldots, x + \zeta^{n-1} y$ have no common factors, that 
is, they are relatively prime. He then argued that a product of relatively 
prime numbers in (3) can be equal to an nth power only if each of the 
factors is an nth power. So 


$$\begin{aligned} x + y &= u_1^n \\ x + \zeta y &= u_2^n \\ &\;\;\vdots \\ x + \zeta^{n-1} y &= u_n^n \end{aligned} \qquad (4)$$


On this basis Lamé derived a contradiction. 

It was immediately pointed out to him by Liouville that the deduction 
of (4) from (3) assumed uniqueness of factorization in a very subtle way. 
Liouville’s fears were confirmed when he later received a letter from Kum- 
mer who had shown that uniqueness of factorization fails in some cases, 
the first being n = 23. Over the summer of 1847 Kummer went on to 
devise his own proof of Fermat’s Last Theorem for certain exponents n, 
surmounting the difficulties of non-uniqueness of factorization by introduc- 
ing the theory of ‘ideal’ complex numbers. In retrospect this theory can 
be viewed as introducing numbers from outside $\mathbb{Z}[\zeta]$ to use as factors when 
factorizing elements within $\mathbb{Z}[\zeta]$. These ‘ideal factors’ restore a version of 
unique factorization. 

Subsequently the theory began to take on a different form from that 
in which Kummer left it, but the key concept of an ‘ideal’ (a reformula- 
tion by Dedekind of Kummer’s ‘ideal number’) gave the theory a major 
boost. By using his theory of ideal numbers, Kummer proved Fermat’s 
Last Theorem for a wide range of prime powers—the so-called ‘regular’ 
primes. He also evolved a powerful machine with applications to many 
other problems in mathematics. In fact a large part of classical num- 
ber theory can be expressed in the framework of algebraic numbers. This 
point of view was urged most strongly by Hilbert in his Zahlbericht of 1897, 
which had an enormous influence on the development of number theory, 
see Reid [57]. As a result, algebraic number theory is today a flourishing 
and important branch of mathematics, with deep methods and insights; 
and—most significantly—applications not only to number theory, but also 
to group theory, algebraic geometry, topology, and analysis. It was these 
wider links that eventually led to the final proof of Fermat’s Last Theorem, 
establishing it once and for all as a theorem, not a conjecture. The 
eventual proof was made possible by various significant inroads, which were 
made using techniques from elliptic functions, modular forms, and Galois 
representations. 

The breakthrough, as indicated above, was made by Andrew Wiles. 
As a teenager, fascinated by the simplicity of the statement of the theo- 
rem, Wiles had begun a long and mostly solitary journey in search of the 
proof. The event that triggered his final push was a conjecture put forward 
by two Japanese mathematicians, Yutaka Taniyama and Goro Shimura, 
who hypothesized a link between elliptic curves and modular forms. Their 
ideas were later refined by André Weil. This proposal became known as 
the Taniyama–Shimura–Weil Conjecture, and it was discovered that if this 
conjecture could be proved, then Fermat’s Last Theorem could be deduced 
from it. At this point, Wiles leaped into action. He worked in solitude for 
seven years before he convinced himself that he had proved a special case of 
the Taniyama–Shimura–Weil Conjecture that was strong enough to imply 
Fermat’s Last Theorem. He announced his result in a lecture in Cambridge 
on 23 June 1993. 

When his proof was being checked, a query from a colleague revealed a 
gap, and Wiles accepted that there were still details to be attended to. It 
took him so long to do this that others questioned whether he had ever been 
close to the proof at all. However, in the autumn of 1994, working with his 
former student Richard Taylor, he finally realised that he could complete 
the proof satisfactorily. He released the proof for scrutiny in October 1994 
and it was published in May 1995. 

Fermat’s Last Theorem has the distinction of being the theorem with 
the greatest number of false ‘proofs’, so the proof was scrutinized very 
carefully. However, this time the ideas fit together so tightly that experts 
in the mathematical community agreed that all was well. In the ensuing 
period nothing has happened to change this opinion: Fermat’s Last The- 
orem has at last been declared true. However, the proof uses techniques 
that lie far beyond what would have been available to Fermat. So when he 
stated that he had found a proof that could not be fitted into the margin of 
his book, had he truly found a perceptive insight that has been missed by 
mathematicians for over 350 years? Or was it, as observed by the historian 
Struik [74], that ‘even the great Fermat slept sometimes’? 


Algebraic Background 


Fermat’s Last Theorem is a special case of the theory of Diophantine equa- 
tions—integer solutions of polynomial equations. To place the problem in 
context, we move to the wider realm of algebraic numbers, which arise as 
the real or complex solutions of polynomials with integer coefficients; we 
focus particularly on algebraic integers, which are solutions of polynomials 
with integer coefficients where the leading coefficient is 1. For example, the 
equation $x^2 - 2 = 0$ has no integer solutions, but it has two real solutions, 
$x = \pm\sqrt{2}$. The leading coefficient of the polynomial $x^2 - 2$ is 1, so $\pm\sqrt{2}$ 
are algebraic integers. 

To operate with such numbers, it is useful to work in subsystems of the 
complex numbers that are closed under the usual operations of arithmetic. 
Such subsystems include subrings (which are closed under addition, sub- 
traction and multiplication) and subfields (closed under all four arithmetic 
operations including division). Thus along with $\pm\sqrt{2}$ we consider the ring 
of all numbers $a + b\sqrt{2}$ for $a, b \in \mathbb{Z}$, and the field of all numbers $p + q\sqrt{2}$ 
for $p, q \in \mathbb{Q}$. 
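
To see concretely why these sets are closed under the arithmetic operations, one can represent a + b√2 by the pair (a, b) and compute exactly. The following Python sketch is only an illustration under that assumed representation; the function names mul and inverse are ours. The inverse uses the identity (p + q√2)(p - q√2) = p² - 2q², which is non-zero for rational p, q not both zero because √2 is irrational.

```python
from fractions import Fraction


def mul(u, v):
    """Multiply a + b*sqrt(2) by c + d*sqrt(2), each stored as a pair (a, b)."""
    a, b = u
    c, d = v
    return (a * c + 2 * b * d, a * d + b * c)


def inverse(u):
    """Inverse of p + q*sqrt(2) for rational p, q not both zero."""
    p, q = Fraction(u[0]), Fraction(u[1])
    norm = p * p - 2 * q * q  # non-zero because sqrt(2) is irrational
    return (p / norm, -q / norm)


u = (3, 5)                 # the number 3 + 5*sqrt(2)
print(mul(u, (1, -1)))     # a product of integer pairs is again an integer pair
print(mul(u, inverse(u)))  # multiplying by the inverse gives the pair for 1
```

Products of integer pairs stay integral, which reflects the ring; inverses generally need rational coordinates, which is why division leads to the field.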

In this chapter we lay the foundations for algebraic number theory by 
considering some fundamental facts about rings, fields, and other algebraic 
structures, including abelian groups and modules, which are relevant to 
our theoretical development. We expect the reader to be acquainted with 
elementary properties of groups, rings and fields, and to have an elementary 
knowledge of linear algebra over an arbitrary field (up to simple properties 
of determinants). Familiar results at this level will be stated without proof; 
results which we consider less familiar to some readers may be proved in 
full or in outline as we consider appropriate. Results not proved in full are 
given appropriate references, in case the reader wishes to pursue them in 
greater depth. Useful general references on abstract algebra, with emphasis 
on rings and fields, are Fenrick [24], Fraleigh [25], Jacobson [40], Lang [43], 
Sharpe [67], and Stewart and Tall [72]. For group theory, see Burn [11], 
Humphreys [38], Macdonald [45], Neumann et al. [55], and Rotman [63]. 

First we set up the ring-theoretic language (and in particular the notion 
of an ideal, which proves to be so important). Then we consider factoriza- 
tion of polynomials over a ring (which in this book will often be a subfield 
of the complex numbers). Topics of central importance at this stage are 
the factorization of a polynomial over an extension field and the theory of 
elementary symmetric polynomials. Module-theoretic language will help us 
clarify certain points later; and results concerning finitely generated abelian 
groups are proved because they are vital in describing the properties of the 
additive group structure of the subrings of the complex numbers which 
occur. With the prologue of the first chapter behind us we shall then be 
ready to begin the main action. 


1.1 Rings and Fields 


Unless explicitly stated to the contrary, the term ring in this book will 
always mean a commutative ring R with identity element 1 (or $1_R$). If 
such a ring has no zero-divisors (so that in R, $a \neq 0$, $b \neq 0$ implies $ab \neq 0$), 
and if $1 \neq 0$ in it, then it will be called a domain (or integral domain). An 
element a in a ring R is called a unit if there exists $b \in R$ such that ab = 1. 
Such a b is unique: suppose ab = ac = 1. Then c = 1c = abc = acb = 1b = b. The 
unique b such that ab = 1 will be denoted by $a^{-1}$, and $ca^{-1}$ will also be denoted by 
c/a. If $1 \neq 0$ in R and every non-zero element in R is a unit, then R will 
be called a field. 

We shall use the standard notation $\mathbb{N}$ for the set of natural numbers 
0, 1, 2, ..., $\mathbb{Z}$ for the integers, $\mathbb{Q}$ for the rationals, $\mathbb{R}$ for the reals and $\mathbb{C}$ 
for the complex numbers. Under the usual operations $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ are fields, 
$\mathbb{Z}$ is a domain and $\mathbb{N}$ is not even a ring. For $n \in \mathbb{N}$, n > 0, we denote 
the ring of integers modulo n by $\mathbb{Z}_n$. If n is composite, then $\mathbb{Z}_n$ has zero 
divisors, but if n is prime, then $\mathbb{Z}_n$ is a field (Fraleigh [25] p. 217; Stewart 
[71] Theorem 1.1, p. 3). 
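
The contrast between composite and prime moduli is easy to observe by listing units and zero divisors in the ring of integers modulo n. The following Python sketch does this by exhaustive search, representing the ring by the residues 0, 1, ..., n-1; the function names are ours, and the moduli 6 and 7 are merely sample values.

```python
def units(n: int) -> list[int]:
    """Non-zero residues a modulo n for which some b gives a*b = 1 (mod n)."""
    return [a for a in range(1, n) if any(a * b % n == 1 for b in range(1, n))]


def zero_divisors(n: int) -> list[int]:
    """Non-zero residues a modulo n for which some non-zero b gives a*b = 0 (mod n)."""
    return [a for a in range(1, n) if any(a * b % n == 0 for b in range(1, n))]


for n in (6, 7):
    print(n, units(n), zero_divisors(n))
```

Modulo 6 the residues 2, 3, 4 are zero divisors and only 1 and 5 are units; modulo 7 there are no zero divisors and every non-zero residue is a unit, as a field requires.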

A subring S of a ring R will be required to contain $1_R$. We can check 
that S is a subring by demonstrating that $1_R \in S$, and if $s, t \in S$ then 
$s + t, -s, st \in S$. It then forms a ring in its own right under the operations 
restricted from R. In the same way, if K is a field, then a subfield F of K 
is a field under the operations restricted from K. We can check that F is 
a subfield of K by demonstrating $1_K \in F$, and if $s, t \in F$ ($s \neq 0$), then 
$s + t, -s, st, s^{-1} \in F$. 

The concept of an ideal will be of central importance in this text. Recall 
that an ideal is a non-empty subset I of a ring R such that if $r, s \in I$, then 
$r - s \in I$; and if $r \in R$, $s \in I$ then $rs \in I$. We shall also require the 
concept of the quotient ring R/I of R by an ideal I. The elements of 
R/I are cosets I + r of the additive group of I in R, with addition and 
multiplication defined by 


$$\begin{aligned} (I + r) + (I + s) &= I + (r + s) \\ (I + r)(I + s) &= I + rs \end{aligned}$$


for all $r, s \in R$. For example, if $n\mathbb{Z}$ is the set of integer multiples of $n \in \mathbb{Z}$, 
then $\mathbb{Z}/n\mathbb{Z}$ is isomorphic to $\mathbb{Z}_n$. 

A homomorphism f from a ring $R_1$ to a ring $R_2$ is a function $f : R_1 \to R_2$ 
such that 


$$\begin{aligned} f(1_{R_1}) &= 1_{R_2} \\ f(r + s) &= f(r) + f(s) \\ f(rs) &= f(r) f(s) \end{aligned}$$


for all $r, s \in R_1$. A monomorphism is an injective (1-1) homomorphism 
and an isomorphism is a bijective (1-1 and onto) homomorphism. 

The kernel and image of a homomorphism f are defined in the usual 
way: 


ker f = {r € R, | f(r) = 0} 
im f={f(r)e R.|re Ry}. 


The kernel is an ideal of Ri; the image is a subring of Re; and the iso- 
morphism theorem states that there is an isomorphism from R,/kerf to 
imf. (Students requiring explanations of the relevant theory may consult 
Fraleigh [25], Jacobson [40], or Sharpe [67].) 

If $X$ and $Y$ are subsets of a ring $R$ we write $X + Y$ for the set of
all elements $x + y$ ($x \in X$, $y \in Y$); and $XY$ for the set of all finite sums
$\sum x_i y_i$ ($x_i \in X$, $y_i \in Y$). When $X$ and $Y$ are both ideals, so are $X + Y$ and
$XY$.

The sum $X + Y$ of two sets can be generalized to an arbitrary collection
$\{X_i\}_{i \in I}$ by defining $\sum_{i \in I} X_i$ to be the set of all finite sums $x_{i_1} + \ldots + x_{i_n}$
of elements $x_{i_j} \in X_{i_j}$.

We shall make the customary compression of notation with regard to
$\{x\}$ and $x$, writing for example $xY$ for $\{x\}Y$, $x + Y$ for $\{x\} + Y$, and $0$
for $\{0\}$.

The ideal generated by a subset $X$ of $R$ is the smallest ideal of $R$
containing $X$; we shall denote this by $\langle X \rangle$. If $X = \{x_1, \ldots, x_n\}$, then we shall
write $\langle X \rangle$ as $\langle x_1, \ldots, x_n \rangle$. (Some writers use $(X)$ where we have written
$\langle X \rangle$, but then the last-mentioned simplification of notation would reduce
to the notation for an $n$-tuple $(x_1, \ldots, x_n)$, so $\langle X \rangle$ is to be preferred.)

A simple calculation shows that

\[ \langle X \rangle = XR = \sum_{x \in X} xR. \]

The identity element $1_R$ is crucial in this equation. In a commutative
ring without identity we would also have to add on the additive group
generated by $X$ to $XR$ and to $\sum_{x \in X} xR$.

If there exists a finite subset $X = \{x_1, \ldots, x_n\}$ of $R$ such that $I = \langle X \rangle$,
then we say that $I$ is finitely generated as an ideal of $R$. If $I = \langle x \rangle$ for an
element $x \in R$ we say that $I$ is the principal ideal generated by $x$.


Example 1.1. Let $R = \mathbb{Z}$, $X = \{4, 6\}$; then $\langle 4, 6 \rangle$ is finitely generated. In
fact $\langle 4, 6 \rangle$ contains $2 \cdot 4 - 6 = 2$ and it easily follows that $\langle 4, 6 \rangle = \langle 2 \rangle$, so
that further this ideal is principal. More generally, every ideal of $\mathbb{Z}$ is of
the form $\langle n \rangle$ for some $n \in \mathbb{N}$, hence principal.


Example 1.2. Let $R$ be the set $\mathbb{Q}$ under the usual operation of addition,
but define a non-standard multiplication on $R$ by setting $xy = 0$ for all
$x, y \in R$. The ideal $\langle X \rangle$ for a subset $X \subseteq R$ is then equal to the abelian
group generated by $X$ under addition. Now $R$ is an ideal of $R$, but is not
finitely generated. To see this, suppose that $R$ is generated as an abelian
group by elements $p_1/q_1, \ldots, p_n/q_n$. Then the only primes dividing the
denominators of elements of $R$ will be those dividing $q_1, \ldots, q_n$, which is a
contradiction.


If $K$ is a field and $R$ is a subring of $K$ then $R$ is a domain. Conversely,
every domain $D$ can be embedded in a field $L$; and there exists such an
$L$ consisting only of elements $d/e$ where $d, e \in D$ and $e \neq 0$. Such an $L$,
which is unique up to isomorphism, is called the field of fractions or field
of quotients of $D$. (See Fraleigh [25] Theorem 26.1 p. 239.)


Theorem 1.3. Every finite integral domain is a field. 


Proof: Let $D$ be a finite integral domain. Since $1 \neq 0$, $D$ has at
least 2 elements. For $0 \neq x \in D$ the elements $xy$, as $y$ runs through $D$,
are distinct; for if $xy = xz$ then $x(y - z) = 0$ and so $y = z$ since $D$ has
no zero-divisors. Hence, by counting, the set of all elements $xy$ is $D$. Thus
$1 = xy$ for some $y \in D$, and therefore $D$ is a field. $\Box$

Every field has a unique minimal subfield, the prime subfield, and this
is isomorphic either to $\mathbb{Q}$ or to $\mathbb{Z}_p$ where $p$ is a prime number. (See Fraleigh
[25] Theorem 29.7 p. 260; Stewart [71] Theorem 1.2, p. 4.) Correspondingly,
we say that the characteristic of the field is 0 or $p$. In a field of
characteristic $p$ we have $px = 0$ for every element $x$, where as usual we
write

\[ px = (1 + 1 + \ldots + 1)x \]

where there are $p$ summands 1; and $p$ is the smallest positive integer with
this property. In a field of characteristic zero, if $nx = 0$ for some non-zero
element $x$ and integer $n$, then $n = 0$. Our major concern in the
sequel will be subfields of $\mathbb{C}$ (the complex numbers), which of course have
characteristic zero, but fields of prime characteristic will arise naturally
from time to time.

We shall use without further comment the fact that $\mathbb{C}$ is algebraically
closed: given any non-constant polynomial $p$ over $\mathbb{C}$ there exists $x \in \mathbb{C}$ such that
$p(x) = 0$. For a proof of this see Stewart [71] pp. 193 ff.; different proofs
using analysis or topological considerations are in Hardy [34] p. 492 and
Titchmarsh [77] p. 118.


1.2 Factorization of Polynomials 


Later in the book we shall consider factorization in a more general con- 
text. Here we concentrate on factorizing polynomials. First a few general 
remarks. 

In a ring $S$, if we can write $a = bc$ for $a, b, c \in S$, then we say that $b, c$
are factors of $a$. We also say ‘$b$ divides $a$’ and write

\[ b \mid a. \]

For any unit $e \in S$ we can always write

\[ a = e(e^{-1}a), \]

so, trivially, a unit is a factor of all elements in $S$. If $a = bc$ where neither
$b$ nor $c$ is a unit, then $b$ and $c$ are called proper factors and $a$ is said to be
reducible. In particular $0 = 0 \cdot 0$ is reducible.

Note that if $a$ is itself a unit and $a = bc$, we have

\[ 1 = aa^{-1} = bca^{-1}, \]

so $b$ and $c$ are both units. A unit cannot have a proper factorization. We
therefore concentrate on factorization of non-units. A non-unit $a \in S$ is
said to be irreducible if it has no proper factors.

Now we turn our attention to the case $S = R[t]$, the ring of polynomials
in an indeterminate $t$ with coefficients in a ring $R$. The elements of $R[t]$
are expressions

\[ r_n t^n + r_{n-1} t^{n-1} + \ldots + r_1 t + r_0 \]

where $r_0, r_1, \ldots, r_n \in R$ and addition and multiplication are defined in the
obvious way. (For a formal treatment of polynomials, and why not to use
it, see Fraleigh [25] pp. 263-265.)

Given a non-zero polynomial

\[ p = r_n t^n + \ldots + r_0, \]

we define the degree of $p$ to be the largest value of $n$ for which $r_n \neq 0$, and
write it $\partial p$. Polynomials of degree $0, 1, 2, 3, 4, 5, \ldots$ will often be referred
to as constant, linear, quadratic, cubic, quartic, quintic, ... polynomials
respectively. In particular a constant polynomial is just a (non-zero)
element of $R$.

If $R$ is an integral domain, then

\[ \partial pq = \partial p + \partial q \]

for non-zero $p, q$, so $R[t]$ is also an integral domain. If $p = aq$ in $R[t]$, then
$\partial p = \partial a + \partial q$ implies that

\[ \partial q \leq \partial p. \]

When $R$ is not a field, then it is perfectly possible to have a non-trivial
factorization in which $\partial p = \partial q$. For example

\[ 3t^2 + 6 = 3(t^2 + 2) \]

in $\mathbb{Z}[t]$, where neither $3$ nor $t^2 + 2$ is a unit. This is because of the existence
of non-units in $R$. However, if $R$ is a field, then all (non-zero) constants
in $R[t]$ are units and so if $q$ is a proper factor of $p$ for polynomials over a
field, then $\partial q < \partial p$.

Let us concentrate for a time on polynomials over a field $K$. Here we
have the division algorithm which states that if $p, q \neq 0$ then

\[ p = qs + r \]

where either $r = 0$ or $\partial r < \partial q$. The proof is by induction on $\partial p$ and in
practice is no more than long division of $p$ by $q$ leaving remainder $r$, which
is either zero (in which case $q \mid p$) or of degree lower than $q$.
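
Long division of polynomials is easy to mechanize. The sketch below is one possible rendering, assuming polynomials over $\mathbb{Q}$ stored as coefficient lists with the constant term first (a representation chosen here purely for illustration); the names `trim` and `poly_divmod` are hypothetical.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients; the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_divmod(p, q):
    """Return (s, r) with p = q*s + r and either r = 0 or deg r < deg q.

    Polynomials are coefficient lists over Q, lowest degree first.
    """
    q = trim(q)
    if not q:
        raise ZeroDivisionError("division by the zero polynomial")
    r = [Fraction(c) for c in trim(p)]
    s = [Fraction(0)] * max(len(r) - len(q) + 1, 0)
    while len(r) >= len(q):
        shift = len(r) - len(q)
        coeff = r[-1] / q[-1]      # this division is why K must be a field
        s[shift] = coeff
        for i, c in enumerate(q):
            r[i + shift] -= coeff * c
        r = trim(r)                # the leading term has cancelled
    return trim(s), r


# t^3 + 1 divided by t^2 + 1 gives quotient t and remainder 1 - t
quotient, remainder = poly_divmod([1, 0, 0, 1], [1, 0, 1])
print(quotient, remainder)
```

Each pass of the loop removes the leading term of the running remainder, which is the inductive step on $\partial p$ mentioned above.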
The division algorithm is used repeatedly in the Euclidean algorithm,
which is a particularly efficient method for finding the highest common
factor $d$ of non-zero polynomials $p, q$.

This is defined by the properties:

(a) $d \mid p$, $d \mid q$,

(b) If $d' \mid p$ and $d' \mid q$ then $d' \mid d$.

These define $d$ uniquely up to non-zero constant multiples. To calculate
$d$ we first suppose that $p, q$ are named so that $\partial p \geq \partial q$; then divide $q$ into
$p$ to get

\[ p = qs_1 + r_1, \qquad \partial r_1 < \partial q \leq \partial p, \]

and continue in the following way:

\[ q = r_1 s_2 + r_2, \qquad \partial r_2 < \partial r_1, \]
\[ r_1 = r_2 s_3 + r_3, \qquad \partial r_3 < \partial r_2, \]
\[ \vdots \]
\[ r_{n-2} = r_{n-1} s_n + r_n, \qquad \partial r_n < \partial r_{n-1}, \]

until we arrive at a zero remainder:

\[ r_{n-1} = r_n s_{n+1}. \]

The last non-zero remainder $r_n$ is the highest common factor. (From the
last equation $r_n \mid r_{n-1}$, and working back successively, $r_n$ is a factor of
$r_{n-2}, \ldots, r_1, q, p$, verifying (a). If $d' \mid p$, $d' \mid q$, then from the first equation,
$d'$ is a factor of $r_1 = p - qs_1$, and successively working down the equations,
$d'$ is a factor of $r_2, r_3, \ldots, r_n$, so $d' \mid r_n$, verifying (b).) Beginning with
the first equation, and substituting in those which follow, we find that
$r_i = a_i p + b_i q$ for suitable $a_i, b_i \in K[t]$, and in particular the highest
common factor $d = r_n$ is of the form

\[ d = ap + bq \quad \text{for suitable } a, b \in K[t]. \tag{1.1} \]

A useful special case is when $d = 1$, when $p, q$ are called coprime and (1.1)
gives

\[ ap + bq = 1 \quad \text{for suitable } a, b \in K[t]. \]

This technique for calculating the highest common factor can also be used
to find the polynomials $a, b$.


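Example 1.4. $p = t^3 + 1$, $q = t^2 + 1 \in \mathbb{Q}[t]$.

Then

\[ t^3 + 1 = t(t^2 + 1) + (-t + 1), \]
\[ t^2 + 1 = (-t - 1)(-t + 1) + 2, \]
\[ -t + 1 = (-\tfrac{1}{2}t + \tfrac{1}{2}) \cdot 2. \]

The highest common factor is 2, or up to a constant factor, 1, so $p$ and $q$
are coprime, and substituting back from the second equation,

\[ 1 = \tfrac{1}{2}(t^2 + 1) + \tfrac{1}{2}(t + 1)(-t + 1). \]

Then substituting for $-t + 1$ using the first equation:

\[ 1 = \tfrac{1}{2}(t^2 + 1) + \tfrac{1}{2}(t + 1)\bigl((t^3 + 1) - t(t^2 + 1)\bigr) \]
\[ \phantom{1} = (-\tfrac{1}{2}t^2 - \tfrac{1}{2}t + \tfrac{1}{2})(t^2 + 1) + (\tfrac{1}{2}t + \tfrac{1}{2})(t^3 + 1). \]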
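
The back-substitution in Example 1.4 can be carried out mechanically by tracking the multipliers of $p$ and $q$ alongside each remainder. The following Python sketch does this over $\mathbb{Q}$; the list representation (constant term first) and all function names are our own assumptions, and the routine assumes $p$ and $q$ are not both zero.

```python
from fractions import Fraction


def trim(p):
    """Convert to Fractions and drop trailing zero coefficients."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])


def neg(p):
    return [-c for c in p]


def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def divmod_poly(p, q):
    """Quotient and remainder of p by a non-zero q."""
    r, q = trim(p), trim(q)
    s = []
    while len(r) >= len(q):
        term = [Fraction(0)] * (len(r) - len(q)) + [r[-1] / q[-1]]
        s = add(s, term)
        r = add(r, neg(mul(term, q)))
    return s, r


def extended_gcd(p, q):
    """Return (d, a, b) with d = a*p + b*q, d monic, d the highest common factor."""
    r0, r1 = trim(p), trim(q)
    a0, a1 = [Fraction(1)], []      # invariant: r0 = a0*p + b0*q
    b0, b1 = [], [Fraction(1)]      # invariant: r1 = a1*p + b1*q
    while r1:
        s, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        a0, a1 = a1, add(a0, neg(mul(s, a1)))
        b0, b1 = b1, add(b0, neg(mul(s, b1)))
    lead = r0[-1]                   # normalize the last non-zero remainder
    return ([c / lead for c in r0],
            [c / lead for c in a0],
            [c / lead for c in b0])


p = [1, 0, 0, 1]   # t^3 + 1
q = [1, 0, 1]      # t^2 + 1
d, a, b = extended_gcd(p, q)
print(d, a, b)
```

For the polynomials of Example 1.4 this returns $d = 1$, $a = \tfrac{1}{2}t + \tfrac{1}{2}$ and $b = -\tfrac{1}{2}t^2 - \tfrac{1}{2}t + \tfrac{1}{2}$, in agreement with the hand calculation.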


Factorizing a single polynomial $p$ is by no means as straightforward as
finding the highest common factor of two. It is known that every non-zero
polynomial over a field $K$ is a product of finitely many irreducible factors,
and these are unique up to the order in which they are multiplied and up
to constant factors. (See Fraleigh [25] Theorem 31.8 p. 284; Stewart [71]
pp. 19-21.) Finding these factors is very much an ad hoc matter. Linear
factors are easiest, since $(t - \alpha) \mid p$ if and only if $p(\alpha) = 0$.

If $p(\alpha) = 0$, then $\alpha$ is called a zero of $p$. If $(t - \alpha)^m \mid p$ where $m \geq 2$,
then $\alpha$ is a repeated zero and the largest such $m$ is the multiplicity of $\alpha$.

To detect repeated zeros, we use a method which (like much in this
chapter) was far more familiar at the turn of the century than it is now.
Given a polynomial

\[ f = \sum_{i=0}^{n} r_i t^i \]

over a ring $R$ we define

\[ Df = \sum_{i=1}^{n} i r_i t^{i-1}, \]

called for obvious reasons the formal derivative of $f$. It is not hard to check
directly that

\[ D(f + g) = Df + Dg \]
\[ D(fg) = (Df)g + f(Dg). \]

This enables us to check for repeated factors. A factor $q$ of a polynomial $p$
is repeated if $q^r \mid p$ for some $r \geq 2$. In particular $q$ is repeated if its square
divides $p$.
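
The formal derivative involves no limits, so it can be computed directly on coefficient lists. The following Python sketch (integer coefficients, constant term first; the helper names are ours) checks the product rule on one pair of polynomials. It is a spot check, not a proof.

```python
def derivative(f):
    """Formal derivative of f = sum r_i t^i, given as [r_0, r_1, ..., r_n]."""
    return [i * r for i, r in enumerate(f)][1:]


def add(f, g):
    n = max(len(f), len(g))
    return [(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
            for i in range(n)]


def mul(f, g):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


f = [1, 0, 2, 1]   # 1 + 2t^2 + t^3
g = [-3, 1]        # -3 + t

lhs = derivative(mul(f, g))                                # D(fg)
rhs = add(mul(derivative(f), g), mul(f, derivative(g)))    # (Df)g + f(Dg)
print(lhs, rhs)
```

Both sides come out as $1 - 12t - 3t^2 + 4t^3$.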


Theorem 1.5. Let K be a field of characteristic zero. A non-zero polyno- 
mial f over K is divisible by the square of a polynomial of degree > 0 if 
and only if f and Df have a common factor of degree > 0. 


Proof: First suppose $f = g^2 h$. Then

\[ Df = g^2 Dh + 2g(Dg)h \]

and so $f$ and $Df$ have $g$ as a common factor.
Now suppose that $f$ has no squared irreducible factor. Then for any
irreducible factor $g$ of $f$ we find

\[ f = gh \]

where $g$ and $h$ are coprime (otherwise $g$ would be a factor of $h$ and would
occur as a squared factor in $gh$). Were $f$ and $Df$ to have a common factor $g$,
which we take to be irreducible, then on differentiating formally we would
obtain

\[ Df = (Dg)h + g(Dh). \]

So $g$ is a factor of $(Dg)h$, hence of $Dg$ because $g$ and $h$ are coprime. But
$Dg$ is of lower degree than $g$, hence it can only have $g$ as a factor if $Dg = 0$.
Since $K$ has characteristic zero, by direct computation, this implies $g$ is
constant, so $f$ and $Df$ can have no non-trivial common factor. $\Box$


Remark. If the field has characteristic p > 0, then the first part of 
Theorem 1.5, that f having a squared factor implies f and Df have a 
common factor, is still true, and the proof is the same as above. 


A result which we shall need later is: 


Corollary 1.6. An irreducible polynomial over a subfield K of C has no 
repeated zeros in C. 


Proof: Suppose $f$ is irreducible over $K$. Then $f$ and $Df$ must be coprime
(for a common factor would be a squared factor of $f$ by Theorem 1.5, and $f$ is
irreducible). Thus there exist polynomials $a, b$ over $K$ such that $af + bDf = 1$,
and the same equation interpreted over $\mathbb{C}$ shows $f$ and $Df$ to be
coprime over $\mathbb{C}$. By Theorem 1.5 again, $f$ cannot have repeated zeros. $\Box$

We shall often consider factorization of polynomials over $\mathbb{Q}$. When such
a polynomial has integer coefficients we shall find that we need consider only
factors which themselves have integer coefficients. This fact is enshrined in
a result due to Gauss:


Lemma 1.7. Let $p \in \mathbb{Z}[t]$, and suppose that $p = gh$ where $g, h \in \mathbb{Q}[t]$. Then
there exists $\lambda \in \mathbb{Q}$, $\lambda \neq 0$, such that $\lambda g, \lambda^{-1}h \in \mathbb{Z}[t]$.


Proof: Multiplying by the product of the denominators of the coefficients
of $g$, $h$ we can rewrite $p = gh$ as

\[ np = g'h' \]

where $g', h'$ are rational multiples of $g, h$ respectively, $n \in \mathbb{Z}$ and
$g', h' \in \mathbb{Z}[t]$. This means that $n$ divides the coefficients of the product $g'h'$. We
now divide the equation successively by the prime factors of $n$. We shall
establish that if $k$ is a prime factor of $n$, then $k$ divides all the coefficients
of $g'$ or all those of $h'$. Whichever it is, we can divide that particular
polynomial by $k$ to give another polynomial with integer coefficients. After
dividing in this way by all the prime factors of $n$, we are left with

\[ p = \bar{g}\bar{h} \]

where $\bar{g}, \bar{h} \in \mathbb{Z}[t]$ are rational multiples of $g, h$ respectively. Putting $\bar{g} = \lambda g$
for $\lambda \in \mathbb{Q}$, we obtain $\bar{h} = \lambda^{-1}h$ and the result will follow.
It remains to prove that if

\[ g' = g_0 + g_1 t + \ldots + g_r t^r \]
\[ h' = h_0 + h_1 t + \ldots + h_s t^s \]

and a prime $k$ divides all the coefficients of $g'h'$, then $k$ must divide all the
$g_i$ or all the $h_j$. But if a prime $k$ divides neither all the $g_i$ nor all the $h_j$,
we can choose the first coefficient in each set, say $g_m$ and $h_q$, which is not
divisible by $k$. Then the coefficient of $t^{m+q}$ in the product $g'h'$ is

\[ g_0 h_{m+q} + g_1 h_{m+q-1} + \ldots + g_m h_q + \ldots + g_{m+q} h_0 \]

and since every term in this expression is divisible by $k$ except $g_m h_q$, this
would mean that the whole coefficient would not be divisible by $k$, a
contradiction. $\Box$
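
The key step of the proof says that a prime dividing every coefficient of a product must divide every coefficient of one of the factors. Equivalently, the greatest common divisor of the coefficients (often called the content, a term not used in the text above) is multiplicative. The following Python sketch checks this on two small examples of our own choosing; it illustrates the statement and is not a substitute for the proof.

```python
from functools import reduce
from math import gcd


def content(f):
    """Greatest common divisor of the integer coefficients of f."""
    return reduce(gcd, f, 0)


def mul(f, g):
    """Product of two integer polynomials, coefficients lowest degree first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


g = [2, 4, 6]   # 2 + 4t + 6t^2, content 2
h = [3, 9]      # 3 + 9t, content 3
print(content(mul(g, h)), content(g) * content(h))

# two polynomials with coprime coefficients: the product has them too
print(content(mul([1, 2], [3, 5, 7])))
```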


We shall need methods for proving irreducibility of various specific
polynomials over $\mathbb{Z}$. The first of these is known as Eisenstein’s criterion:


Theorem 1.8. Let

\[ f = a_0 + a_1 t + \ldots + a_n t^n \]

be a polynomial over $\mathbb{Z}$. Suppose there is a prime $q$ such that

(a) $q \nmid a_n$,
(b) $q \mid a_i$ ($i = 0, 1, \ldots, n - 1$),
(c) $q^2 \nmid a_0$.

Then, apart from constant factors, $f$ is irreducible over $\mathbb{Z}$, and hence
irreducible over $\mathbb{Q}$.


Proof: By Lemma 1.7 it is enough to show that $f$ can have only constant
factors over $\mathbb{Z}$.
If not, then $f = gh$ where

\[ g = g_0 + g_1 t + \ldots + g_r t^r \]
\[ h = h_0 + h_1 t + \ldots + h_s t^s \]

with all $g_i, h_j \in \mathbb{Z}$ and $r, s \geq 1$, $r + s = n$.

Now $g_0 h_0 = a_0$, so (b) implies $q$ divides one of $g_0, h_0$ whilst (c) implies
it cannot divide both. Without loss of generality, suppose $q$ divides $g_0$ but
not $h_0$. Not all $g_i$ are divisible by $q$ because this would imply that $q$ divides
$a_n$, contrary to (a). Let $g_m$ be the first coefficient of $g$ not divisible by $q$.
Then

\[ a_m = g_0 h_m + \ldots + g_m h_0 \]

where $m \leq r < n$. All the summands on the right are divisible by $q$ except
the last, which means that $a_m$ is not divisible by $q$, contradicting (b). $\Box$
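
Conditions (a), (b) and (c) are purely arithmetic, so they can be tested directly. The following Python sketch is a minimal check, assuming the coefficients are supplied as the list $[a_0, a_1, \ldots, a_n]$ with $n \geq 1$ and that $q$ really is prime (the function, whose name `eisenstein` is ours, does not verify primality).

```python
def eisenstein(coeffs, q):
    """True if a_0 + a_1 t + ... + a_n t^n satisfies (a), (b), (c) for the prime q."""
    if len(coeffs) < 2:
        raise ValueError("need a polynomial of degree at least 1")
    *lower, a_n = coeffs
    return (a_n % q != 0                            # (a) q does not divide a_n
            and all(a % q == 0 for a in lower)      # (b) q divides a_0, ..., a_{n-1}
            and coeffs[0] % (q * q) != 0)           # (c) q^2 does not divide a_0


print(eisenstein([-2, 0, 1], 2))   # t^2 - 2 with q = 2
print(eisenstein([4, 0, 1], 2))    # t^2 + 4 with q = 2: condition (a) holds, (c) fails
```

A result of `False` only means that the criterion does not apply for that prime; it says nothing about whether the polynomial is reducible.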


A second useful method is reduction modulo $n$, as follows. Suppose
$0 \neq p \in \mathbb{Z}[t]$, with $p$ reducible: say $p = qr$. The natural homomorphism
$\mathbb{Z} \to \mathbb{Z}_n$ gives rise to a homomorphism $\mathbb{Z}[t] \to \mathbb{Z}_n[t]$. Using bars to denote
images under this map, we have $\bar{p} = \bar{q}\bar{r}$. If $\partial\bar{p} = \partial p$, then clearly $\partial\bar{q} = \partial q$,
$\partial\bar{r} = \partial r$, and $\bar{p}$ is also reducible. This proves:


Theorem 1.9. If $p \in \mathbb{Z}[t]$ and its image $\bar{p} \in \mathbb{Z}_n[t]$ is irreducible, with
$\partial\bar{p} = \partial p$, then $p$ is irreducible as an element of $\mathbb{Z}[t]$. $\Box$


In practice we take $n$ to be prime, though this is not essential. The
point of reducing modulo $n$ is that, $\mathbb{Z}_n$ being finite, there are only a finite
number of possible factors of $\bar{p}$ to be considered.

Examples.


1.1 The polynomial $t^2 - 2$ satisfies Eisenstein’s criterion with $q = 2$.


1.2 The polynomial $t^{11} - 7t^6 + 21t^5 + 49t - 56$ satisfies Eisenstein’s criterion
with $q = 7$.


1.3 The polynomial $t^5 - t + 1$ does not satisfy Eisenstein’s criterion for
any $q$. Instead we try reduction modulo 5. There is no linear factor
since none of 0, 1, 2, 3, 4 yield 0 when substituted for $t$, so the only
possible way to factorize is

\[ t^5 - t + 1 = (t^2 + \alpha t + \beta)(t^3 + \gamma t^2 + \delta t + \varepsilon) \]

where $\alpha, \beta, \gamma, \delta, \varepsilon$ take values 0, 1, 2, 3 or 4 (mod 5). This gives a
system of equations on comparing coefficients: there are only a finite
number of possibilities all of which are easily eliminated. Hence the
polynomial is irreducible mod 5, so irreducible over $\mathbb{Z}$.
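
The finite search in the last example is tedious by hand but immediate by machine. The following Python sketch is one way to organize it, not the comparison of coefficients described above: it tests every element of $\mathbb{Z}_5$ as a zero and then tries each of the 25 monic quadratics as a divisor. A reducible quintic over a field must have a monic factor of degree 1 or 2, so two empty lists establish irreducibility mod 5. The helper `rem_by_monic` is our own.

```python
P = 5
f = [1, -1, 0, 0, 0, 1]   # t^5 - t + 1, constant term first


def rem_by_monic(f, g, p):
    """Remainder of f on division by the monic polynomial g, coefficients in Z_p."""
    r = [c % p for c in f]
    while len(r) >= len(g):
        lead, shift = r[-1], len(r) - len(g)
        for i, c in enumerate(g):
            r[i + shift] = (r[i + shift] - lead * c) % p
        r.pop()            # the leading coefficient is now zero
    return r


# zeros of f in Z_5, i.e. linear factors
roots = [a for a in range(P)
         if sum(c * a**i for i, c in enumerate(f)) % P == 0]

# monic quadratics t^2 + a t + b dividing f over Z_5
quadratic_factors = [(a, b) for a in range(P) for b in range(P)
                     if not any(rem_by_monic(f, [b, a, 1], P))]

print(roots, quadratic_factors)
```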


1.3 Field Extensions 


In finding the zeros of a polynomial $p$ over a field $K$ it is often necessary to
pass to a larger field $L$ containing $K$. In these circumstances, $L$ is called
a field extension of $K$. For example, $p(t) = t^2 + 1$ has no zeros in $\mathbb{R}$, but
considering $p$ as a polynomial over $\mathbb{C}$, it has zeros $\pm i$ and a factorization

\[ p(t) = (t + i)(t - i). \]

Field extensions often arise in a slightly more general context as a
monomorphism $j : K \to L$ where $K$ and $L$ are fields. We shall see such
instances shortly. It is customary in these cases to identify $K$ with its
image $j(K)$, which is a subfield of $L$; then a field extension is a pair of fields
$(K, L)$ where $K$ is a subfield of $L$. We talk of the extension

\[ L : K \]

of $K$. Most field extensions with which we shall deal will involve two
subfields of $\mathbb{C}$.

If $L : K$ is a field extension, then $L$ has a natural structure as a vector
space over $K$ (where vector addition is addition in $L$ and scalar
multiplication of $\lambda \in K$ on $v \in L$ is just $\lambda v \in L$). The dimension of this vector
space is called the degree of the extension, or the degree of $L$ over $K$, and
written

\[ [L : K]. \]

The degree has an important multiplicative property:

Theorem 1.10. If $H \subseteq K \subseteq L$ are fields, then

\[ [L : H] = [L : K][K : H]. \]

Proof: We sketch this. Details are in Stewart [71], Theorem 4.2 p. 50. Let
$\{a_i\}_{i \in I}$ be a basis for $L$ over $K$, and $\{b_j\}_{j \in J}$ a basis for $K$ over $H$. Then
$\{a_i b_j\}_{(i,j) \in I \times J}$ is a basis for $L$ over $H$. $\Box$


If $[L : K]$ is finite we say that $L$ is a finite extension of $K$.

Given a field extension $L : K$ and an element $\alpha \in L$, there may or may
not exist a polynomial $p \in K[t]$ such that $p(\alpha) = 0$, $p \neq 0$. If not, we
say that $\alpha$ is transcendental over $K$. If such a $p$ exists, we say that $\alpha$ is
algebraic over $K$. If $\alpha$ is algebraic over $K$, then there exists a unique monic
polynomial $q$ of minimal degree subject to $q(\alpha) = 0$, and $q$ is called the
minimum polynomial of $\alpha$ over $K$. (A monic polynomial is one with highest
coefficient 1.) The minimum polynomial of $\alpha$ is irreducible over $K$. (These
facts are to be found in Stewart [71] pp. 38, 39.)

If $\alpha_1, \ldots, \alpha_n \in L$, we write

\[ K(\alpha_1, \ldots, \alpha_n) \]

for the smallest subfield of $L$ containing $K$ and the elements $\alpha_1, \ldots, \alpha_n$.
In an analogous way, if $S$ is a subring of a ring $R$ and $\alpha_1, \ldots, \alpha_n \in R$,
we write

\[ S[\alpha_1, \ldots, \alpha_n] \]

for the smallest subring of $R$ containing $S$ and the elements $\alpha_1, \ldots, \alpha_n$.
Clearly $S[\alpha_1, \ldots, \alpha_n]$ consists of all polynomials in $\alpha_1, \ldots, \alpha_n$ with
coefficients in $S$. For instance $S[\alpha]$ consists of polynomials

\[ s_0 + s_1\alpha + \ldots + s_m\alpha^m \qquad (s_i \in S). \]

The case of $K(\alpha)$ is more interesting. If $\alpha$ is transcendental over $K$, then
for $k_m \neq 0$ we have

\[ k_0 + k_1\alpha + \ldots + k_m\alpha^m \neq 0 \qquad (k_i \in K). \]

In this case $K(\alpha)$ must include all rational expressions

\[ \frac{s_0 + s_1\alpha + \ldots + s_n\alpha^n}{k_0 + k_1\alpha + \ldots + k_m\alpha^m} \qquad (s_j, k_i \in K,\ k_m \neq 0) \]

and clearly consists precisely of these elements.

However, for $\alpha$ algebraic, we have:


Theorem 1.11. If $L : K$ is a field extension and $\alpha \in L$, then $\alpha$ is
algebraic over $K$ if and only if $K(\alpha)$ is a finite extension of $K$. In this
case, $[K(\alpha) : K] = \partial p$ where $p$ is the minimum polynomial of $\alpha$ over $K$, and
$K(\alpha) = K[\alpha]$.


Proof: Once more we sketch the proof, given in full in Stewart [71]
Proposition 4.3, p. 52. If $[K(\alpha) : K] = n < \infty$ then the powers $1, \alpha, \alpha^2, \ldots, \alpha^n$
are linearly dependent over $K$, whence $\alpha$ is algebraic. Conversely, suppose
$\alpha$ algebraic with minimum polynomial $p$ of degree $m$. We claim that $K(\alpha)$
is the vector space over $K$ spanned by $1, \alpha, \ldots, \alpha^{m-1}$. This (call it $V$) is
certainly closed under addition, subtraction, and multiplication by $\alpha$; for
the last statement note that $\alpha^m = -p(\alpha) + \alpha^m = q(\alpha)$ where $\partial q < m$.
Hence $V$ is closed under multiplication, and so forms a ring. All we need
prove now is that if $0 \neq v \in V$ then $1/v \in V$. Now $v = h(\alpha)$ where $h \in K[t]$
and $\partial h < m$. Since $p$ is irreducible, $p$ and $h$ are coprime, so there exist
$f, g \in K[t]$ such that

\[ f(t)p(t) + g(t)h(t) = 1. \]

Then

\[ 1 = f(\alpha)p(\alpha) + g(\alpha)h(\alpha) = g(\alpha)h(\alpha) \]

so that $1/v = g(\alpha) \in V$ as required. But it follows at once that $[K(\alpha) : K] =
\dim_K V = m$. $\Box$
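
To see the last step of the proof in action, here is a small worked instance of our own choosing (it does not come from the references). Take $K = \mathbb{Q}$ and $p = t^3 - 2$, which is irreducible by Eisenstein's criterion with $q = 2$, and let $\alpha$ be a zero of $p$. To invert $v = 1 + \alpha$ we take $h = t + 1$ and divide $p$ by $h$:

```latex
\begin{align*}
t^3 - 2 &= (t^2 - t + 1)(t + 1) - 3, \\
1 &= -\tfrac{1}{3}\,(t^3 - 2) + \tfrac{1}{3}\,(t^2 - t + 1)(t + 1), \\
\frac{1}{1 + \alpha} &= \tfrac{1}{3}\,(\alpha^2 - \alpha + 1) \in V .
\end{align*}
```

In the notation of the proof, $f = -\tfrac{1}{3}$ and $g = \tfrac{1}{3}(t^2 - t + 1)$. The result is confirmed by $(1 + \alpha)(\alpha^2 - \alpha + 1) = \alpha^3 + 1 = 3$.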


If we specify in advance $K$ and an irreducible monic polynomial $p(t) \in K[t]$
then there exists up to isomorphism a unique extension field $L$ such
that $L$ contains an element $\alpha$ with minimum polynomial $p$, and $L = K(\alpha)$.
This can be constructed as $K[t]/\langle p \rangle$. It is customary to express this
construction by the phrase ‘adjoin to $K$ an element $\alpha$ with $p(\alpha) = 0$’ and
to write $K(\alpha)$ for the resulting field. This, and much else, is discussed in
Stewart [71] Chapter 3, pp. 33-45.


1.4 Symmetric Polynomials 


Let $R[t_1, t_2, \ldots, t_n]$ denote the ring of polynomials in indeterminates
$t_1, t_2, \ldots, t_n$ with coefficients in a ring $R$. Let $S_n$ denote the
symmetric group of permutations on $\{1, 2, \ldots, n\}$. For any permutation $\pi \in S_n$ and
any polynomial $f \in R[t_1, \ldots, t_n]$ we define the polynomial $f^\pi$ by

\[ f^\pi(t_1, \ldots, t_n) = f(t_{\pi(1)}, \ldots, t_{\pi(n)}). \]

For example if $f = t_1 + t_2 t_3$ and $\pi$ is the cycle $(123)$ then $f^\pi = t_2 + t_3 t_1$.
The polynomial $f$ is symmetric if $f^\pi = f$ for all $\pi \in S_n$. For example
$t_1 + \ldots + t_n$ is symmetric. More generally we have the elementary symmetric
polynomials

\[ s_r(t_1, \ldots, t_n) \qquad (1 \leq r \leq n) \]

defined to be the sum of all possible distinct products of $r$ distinct $t_i$’s.
Thus

\[ s_1(t_1, \ldots, t_n) = t_1 + t_2 + \ldots + t_n, \]
\[ s_2(t_1, \ldots, t_n) = t_1 t_2 + t_1 t_3 + \ldots + t_2 t_3 + \ldots + t_{n-1} t_n, \]
\[ \vdots \]
\[ s_n(t_1, \ldots, t_n) = t_1 t_2 \ldots t_n. \]


These arise in the following circumstances: consider a polynomial of degree
$n$ over a subfield $K$ of $\mathbb{C}$,

\[ f = a_n t^n + \ldots + a_0, \]

and resolve it into linear factors over $\mathbb{C}$:

\[ f = a_n(t - \alpha_1) \ldots (t - \alpha_n). \]

Then, expanding the product, we find

\[ f = a_n(t^n - s_1 t^{n-1} + \ldots + (-1)^n s_n), \]

where $s_i$ denotes $s_i(\alpha_1, \ldots, \alpha_n)$.
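
The expansion can be checked numerically for a particular set of zeros. The Python sketch below (function names are ours, and integer zeros are used only to keep the arithmetic exact) multiplies out $(t - \alpha_1) \ldots (t - \alpha_n)$ and compares the coefficients with the signed elementary symmetric polynomials of the zeros.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(r, values):
    """s_r(values): the sum of all products of r distinct entries."""
    return sum(prod(c) for c in combinations(values, r))


def expand_monic(zeros):
    """Coefficients of (t - a_1)...(t - a_n), highest degree first."""
    coeffs = [1]
    for a in zeros:
        # multiply the current polynomial by (t - a)
        coeffs = [c - a * d for c, d in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


zeros = [1, 2, 3]
expanded = expand_monic(zeros)
signed = [(-1) ** r * elementary_symmetric(r, zeros)
          for r in range(len(zeros) + 1)]
print(expanded, signed)
```

For the zeros 1, 2, 3 both lists are the coefficients of $t^3 - 6t^2 + 11t - 6$.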

A polynomial in $s_1, \ldots, s_n$ can clearly be rewritten as a symmetric
polynomial in $t_1, \ldots, t_n$. The converse is also true, a fact first proved by
Newton:


Theorem 1.12. Let $R$ be a ring. Then every symmetric polynomial in
$R[t_1, \ldots, t_n]$ is expressible as a polynomial with coefficients in $R$ in the
elementary symmetric polynomials $s_1, \ldots, s_n$.


Proof: We shall demonstrate a specific technique for reducing a symmetric
polynomial into elementary ones. First we order the monomials $t_1^{\alpha_1} \ldots t_n^{\alpha_n}$
by a ‘lexicographic’ order in which $t_1^{\alpha_1} \ldots t_n^{\alpha_n}$ precedes $t_1^{\beta_1} \ldots t_n^{\beta_n}$ if the first
nonzero $\alpha_i - \beta_i$ is positive. Then given a polynomial $p \in R[t_1, \ldots, t_n]$, we
order its terms lexicographically. If $p$ is symmetric, then for every
monomial $a t_1^{\alpha_1} \ldots t_n^{\alpha_n}$ occurring in $p$, there occurs a similar monomial with the
exponents permuted. Let $\alpha_1$ be the highest exponent occurring in
monomials of $p$: then there is a term containing $t_1^{\alpha_1}$. The leading term of $p$
in lexicographic ordering contains $t_1^{\alpha_1}$, and among all such monomials we
select the one with the highest occurring power of $t_2$ and so on. In
particular, the leading term of a symmetric polynomial is of the form $a t_1^{\alpha_1} \ldots t_n^{\alpha_n}$
where $\alpha_1 \geq \ldots \geq \alpha_n$. For example, the leading term of

\[ s_1^{k_1} \ldots s_n^{k_n} = (t_1 + \ldots + t_n)^{k_1} \ldots (t_1 \ldots t_n)^{k_n} \]

is

\[ t_1^{k_1 + \ldots + k_n} t_2^{k_2 + \ldots + k_n} \ldots t_n^{k_n}. \]

By choosing $k_1 = \alpha_1 - \alpha_2, \ldots, k_{n-1} = \alpha_{n-1} - \alpha_n$, $k_n = \alpha_n$ (which is
possible because $\alpha_1 \geq \ldots \geq \alpha_n$), we can make this the same as the leading
term of $p$. Then

\[ p - a s_1^{k_1} \ldots s_n^{k_n} \]

has a lexicographic leading term

\[ b t_1^{\beta_1} \ldots t_n^{\beta_n} \qquad (\beta_1 \geq \ldots \geq \beta_n) \]

which comes after $a t_1^{\alpha_1} \ldots t_n^{\alpha_n}$ in the ordering. But only a finite number of
monomials $t_1^{\gamma_1} \ldots t_n^{\gamma_n}$ satisfying $\gamma_1 \geq \ldots \geq \gamma_n$ follow $t_1^{\alpha_1} \ldots t_n^{\alpha_n}$
lexicographically, and so a finite number of repetitions of the given process reduce $p$
to a polynomial in $s_1, \ldots, s_n$. $\Box$


Example 1.13. The symmetric polynomial

\[ p = t_1^2 t_2 + t_1^2 t_3 + t_1 t_2^2 + t_1 t_3^2 + t_2^2 t_3 + t_2 t_3^2 \]

is written lexicographically. Here $n = 3$, $\alpha_1 = 2$, $\alpha_2 = 1$, $\alpha_3 = 0$ and the
method tells us to consider

\[ p - s_1 s_2. \]

This simplifies to give

\[ p - s_1 s_2 = -3 t_1 t_2 t_3. \]

The polynomial $3 t_1 t_2 t_3$ is visibly $3 s_3$, but the method, using $\alpha_1 = \alpha_2 =
\alpha_3 = 1$, also leads us to this conclusion. Hence $p = s_1 s_2 - 3 s_3$.
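
The reduction used in the proof of Theorem 1.12 is an algorithm, and it may help to see it spelled out. The following Python sketch is one possible rendering under our own conventions: a polynomial in $t_1, \ldots, t_n$ is a dictionary mapping exponent tuples to integer coefficients, and the output maps $(k_1, \ldots, k_n)$ to the coefficient of $s_1^{k_1} \ldots s_n^{k_n}$. All function names are hypothetical.

```python
from itertools import combinations


def poly_mul(p, q):
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}


def poly_sub(p, q):
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) - c
    return {e: c for e, c in out.items() if c != 0}


def elementary(r, n):
    """s_r(t_1, ..., t_n) as a dictionary of exponent tuples."""
    return {tuple(1 if i in idx else 0 for i in range(n)): 1
            for idx in combinations(range(n), r)}


def reduce_symmetric(p, n):
    """Express a symmetric polynomial p in terms of s_1, ..., s_n."""
    result = {}
    while p:
        alpha = max(p)          # tuples compare lexicographically: leading term
        a = p[alpha]
        # k_i = alpha_i - alpha_{i+1}, with k_n = alpha_n
        k = tuple(alpha[i] - (alpha[i + 1] if i + 1 < n else 0)
                  for i in range(n))
        if min(k) < 0:
            raise ValueError("input is not symmetric")
        term = {(0,) * n: a}    # build a * s_1^k_1 ... s_n^k_n
        for r, power in enumerate(k, start=1):
            for _ in range(power):
                term = poly_mul(term, elementary(r, n))
        p = poly_sub(p, term)
        result[k] = result.get(k, 0) + a
    return result


# the polynomial of Example 1.13
p = {(2, 1, 0): 1, (2, 0, 1): 1, (1, 2, 0): 1,
     (1, 0, 2): 1, (0, 2, 1): 1, (0, 1, 2): 1}
print(reduce_symmetric(p, 3))   # encodes p = s_1 s_2 - 3 s_3
```

Python compares tuples lexicographically, so `max(p)` picks out exactly the leading term in the ordering used in the proof.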


This result about symmetric functions proves to be extremely useful in
the following instance:


Corollary 1.14. Suppose that $L$ is an extension of the field $K$, $p \in K[t]$,
$\partial p = n$ and the zeros of $p$ are $\theta_1, \ldots, \theta_n \in L$. If $h(t_1, \ldots, t_n) \in
K[t_1, \ldots, t_n]$ is symmetric, then $h(\theta_1, \ldots, \theta_n) \in K$.


1.5 Modules 


Let $R$ be a ring. By an $R$-module we mean an abelian group $M$ (written
additively), together with a function $\alpha : R \times M \to M$, for which we write
$\alpha(r, m) = rm$ ($r \in R$, $m \in M$), satisfying

(a) $(r + s)m = rm + sm$,
(b) $r(m + n) = rm + rn$,
(c) $r(sm) = (rs)m$,
(d) $1m = m$,

for all $r, s \in R$, $m, n \in M$.

(Although (d) is always obligatory in this text, be warned that in other
parts of mathematics it may not be required to be so.) The function $\alpha$ is
called an $R$-action on $M$.

If $R$ is a field $K$, then an $R$-module is the same thing as a vector space
over $K$. In this sense one may think of an $R$-module as a generalization of
a vector space; but because of the lack of division in $R$, many of the
techniques in vector space theory do not carry over as they stand to $R$-modules.
The basic theory of modules may be found in Fraleigh [25] section 37.2,
p. 338. In particular we define an $R$-submodule of $M$ to be a subgroup $N$ of
$M$ (under addition) such that if $n \in N$, $r \in R$, then $rn \in N$. We may then
define the quotient module $M/N$ to be the corresponding quotient group,
with $R$-action

\[ r(N + m) = N + rm \qquad (r \in R,\ m \in M). \]

If $X \subseteq M$, $Y \subseteq R$, we define $YX$ to be the set of all finite sums $\sum_i y_i x_i$
where $y_i \in Y$, $x_i \in X$.
The submodule of $M$ generated by $X$, which we write

\[ \langle X \rangle_R, \]

is the smallest submodule containing $X$. This is equal to $RX$. If $N =
\langle x_1, \ldots, x_n \rangle_R$ then we say that $N$ is a finitely generated $R$-module.

A $\mathbb{Z}$-module is nothing more than an abelian group $M$ (written
additively), and conversely, given an additive abelian group $M$ we can make it
into a $\mathbb{Z}$-module by defining

\[ 0m = 0, \quad 1m = m \qquad (m \in M) \]

then inductively

\[ (n + 1)m = nm + m \qquad (n \in \mathbb{Z},\ n > 0) \]

and

\[ (-n)m = -nm \qquad (n \in \mathbb{Z},\ n > 0). \]

We shall discuss this case further in the next section.
More generally there are several natural ways in which $R$-modules can
arise, of which we distinguish three:

1. Suppose $R$ is a subring of a ring $S$. Then $S$ is an $R$-module with
action

\[ \alpha(r, s) = rs \qquad (r \in R,\ s \in S) \]

where the product is just that of elements in $S$.

2. Suppose $I$ is an ideal of the ring $R$. Then $I$ is an $R$-module under

\[ \alpha(r, i) = ri \qquad (r \in R,\ i \in I) \]

where the product is that in $R$.

3. Suppose $J \subseteq I$ is another ideal: then $J$ is also an $R$-module. The
quotient module $I/J$ has the action

\[ r(J + i) = J + ri \qquad (r \in R,\ i \in I). \]


1.6 Free Abelian Groups 


The study of algebraic numbers in this text will be carried out not only in
subfields of $\mathbb{C}$, but also will require properties of subrings of $\mathbb{C}$. A typical
instance might be the subring

\[ \mathbb{Z}[i] = \{ a + bi \in \mathbb{C} \mid a, b \in \mathbb{Z} \}. \]

Considering the additive group of $\mathbb{Z}[i]$, we find that it is isomorphic to
$\mathbb{Z} \times \mathbb{Z}$. More generally the additive group of those subrings of $\mathbb{C}$ that
we shall study will usually be isomorphic to the direct product of a finite
number of copies of $\mathbb{Z}$. In this section we study the properties of such
abelian groups which will prove of use later in this text.

Let $G$ be an abelian group. In this section we shall use additive notation
for $G$, so the group operation will be denoted by ‘$+$’, the identity by $0$,
the inverse of $g$ by $-g$ and powers of $g$ by $2g, 3g, \ldots$. In later chapters
we shall encounter cases where multiplicative notation is more appropriate
and expect the reader to make the transition without undue fuss.

If $G$ is finitely generated as a $\mathbb{Z}$-module, so that there exist $g_1, \ldots, g_n \in G$
such that every $g \in G$ is a sum

\[ g = m_1 g_1 + \ldots + m_n g_n \qquad (m_i \in \mathbb{Z}) \]

then $G$ is called a finitely generated abelian group.
Generalizing the notion of linear independence in a vector space, we say
that elements $g_1, \ldots, g_n$ in an abelian group $G$ are linearly independent
(over $\mathbb{Z}$) if any equation

\[ m_1 g_1 + \ldots + m_n g_n = 0 \]

with $m_1, \ldots, m_n \in \mathbb{Z}$ implies $m_1 = \ldots = m_n = 0$. A linearly independent
set which generates $G$ is called a basis (or a $\mathbb{Z}$-basis for emphasis). If
$\{g_1, \ldots, g_n\}$ is a basis, then every $g \in G$ has a unique representation:

\[ g = m_1 g_1 + \ldots + m_n g_n \qquad (m_i \in \mathbb{Z}) \]

because an alternative expression

\[ g = k_1 g_1 + \ldots + k_n g_n \qquad (k_i \in \mathbb{Z}) \]

implies

\[ (m_1 - k_1) g_1 + \ldots + (m_n - k_n) g_n = 0 \]

and linear independence implies $m_i = k_i$ ($1 \leq i \leq n$).

If $\mathbb{Z}^n$ denotes the direct product of $n$ copies of the additive group of
integers, it follows that a group with a basis of $n$ elements is isomorphic to
$\mathbb{Z}^n$.

To show that two different bases of $G$ have the same number of elements,
let $2G$ be the subgroup of $G$ consisting of all elements of the form $g + g$
($g \in G$). If $G$ has a basis of $n$ elements, then $G/2G$ is a group of order $2^n$.
Since the definition of $2G$ does not depend on any particular basis, every
basis must have the same number of elements.


An abelian group with a basis of $n$ elements is called a free abelian group
of rank $n$. If $G$ is free abelian of rank $n$ and $\{x_1, \ldots, x_n\}$, $\{y_1, \ldots, y_n\}$ are
both bases, then there exist integers $a_{ij}, b_{ij}$ such that

\[ y_i = \sum_j a_{ij} x_j, \qquad x_i = \sum_j b_{ij} y_j. \]

If we consider the matrices

\[ A = (a_{ij}), \qquad B = (b_{ij}) \]

it follows that $AB = I_n$, the identity matrix. Hence

\[ \det(A)\det(B) = 1 \]

and since $\det(A)$ and $\det(B)$ are integers, we must have

\[ \det(A) = \det(B) = \pm 1. \]

A square matrix over $\mathbb{Z}$ with determinant $\pm 1$ is said to be unimodular. We
have:


Lemma 1.15. Let $G$ be a free abelian group of rank $n$ with basis $\{x_1, \ldots, x_n\}$.
Suppose $(a_{ij})$ is an $n \times n$ matrix with integer entries. Then the elements

\[ y_i = \sum_j a_{ij} x_j \]

form a basis of $G$ if and only if $(a_{ij})$ is unimodular.


Proof: The ‘only if’ part has already been dealt with. Now suppose
$A = (a_{ij})$ is unimodular. Since $\det(A) \neq 0$ it follows that the $y_i$ are
linearly independent. We have

\[ A^{-1} = (\det(A))^{-1} \tilde{A} \]

where $\tilde{A}$ is the adjoint matrix and has integer entries. Hence $A^{-1} = \pm\tilde{A}$
has integer entries. Putting $B = A^{-1} = (b_{ij})$ we obtain $x_i = \sum_j b_{ij} y_j$,
demonstrating that the $y_i$ generate $G$. Thus they form a basis. $\Box$
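
The proof is constructive: for a unimodular matrix the inverse is $\pm$ the adjoint matrix, which can be computed with integer arithmetic only. The Python sketch below follows that recipe using cofactor expansion, which is adequate for the small matrices met in hand calculations but is not an efficient general method. The function names are ours.

```python
def det(m):
    """Determinant of a square integer matrix by cofactor expansion."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def adjugate(m):
    """Transpose of the matrix of cofactors (the adjoint matrix of the text)."""
    n = len(m)
    if n == 1:
        return [[1]]

    def cofactor(i, j):
        minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
        return (-1) ** (i + j) * det(minor)

    return [[cofactor(j, i) for j in range(n)] for i in range(n)]


def integer_inverse(m):
    """Inverse of a unimodular matrix, with integer entries."""
    d = det(m)
    if d not in (1, -1):
        raise ValueError("matrix is not unimodular")
    return [[d * c for c in row] for row in adjugate(m)]   # 1/d == d when d = +-1


A = [[2, 1],
     [1, 1]]                  # y_1 = 2x_1 + x_2, y_2 = x_1 + x_2
print(det(A), integer_inverse(A))
```

Here $\det(A) = 1$, so $y_1, y_2$ form a basis, and the inverse matrix recovers $x_1 = y_1 - y_2$, $x_2 = -y_1 + 2y_2$.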


The central result in the theory of finitely generated free abelian groups
concerns the structure of subgroups:


Theorem 1.16. Every subgroup $H$ of a free abelian group $G$ of rank $n$ is
free of rank $s \leq n$. Moreover there exists a basis $u_1, \ldots, u_n$ for $G$ and
positive integers $\alpha_1, \ldots, \alpha_s$ such that $\alpha_1 u_1, \ldots, \alpha_s u_s$ is a basis for $H$.

Proof: We use induction on the rank $n$ of $G$. For $n = 1$, $G$ is infinite
cyclic and the result is a consequence of the subgroup structure of the
cyclic group. If $G$ is of rank $n$, pick any basis $w_1, \ldots, w_n$ of $G$. Every $h \in H$
is of the form

\[ h = h_1 w_1 + \ldots + h_n w_n. \]

Either $H = \{0\}$, in which case the theorem is trivial, or there exist non-zero
coefficients $h_i$ for some $h \in H$. From all such coefficients, let $\lambda(w_1, \ldots, w_n)$
be the least positive integer occurring. Now choose the basis $w_1, \ldots, w_n$
to make $\lambda(w_1, \ldots, w_n)$ minimal. Let $\alpha_1$ be this minimal value, and number
the $w_i$ in such a way that

\[ v_1 = \alpha_1 w_1 + \beta_2 w_2 + \ldots + \beta_n w_n \]

is an element of $H$ in which $\alpha_1$ occurs as a coefficient. Let

\[ \beta_i = \alpha_1 q_i + r_i \qquad (2 \leq i \leq n) \]

where $0 \leq r_i < \alpha_1$, so that $r_i$ is the remainder on dividing $\beta_i$ by $\alpha_1$. Define

\[ u_1 = w_1 + q_2 w_2 + \ldots + q_n w_n. \]

Then it is easy to verify that $u_1, w_2, \ldots, w_n$ is another basis for $G$. (The
appropriate matrix is clearly unimodular.) With respect to the new basis,

\[ v_1 = \alpha_1 u_1 + r_2 w_2 + \ldots + r_n w_n. \]

By the minimality of $\alpha_1 = \lambda(w_1, \ldots, w_n)$ for all bases we have

\[ r_2 = \ldots = r_n = 0. \]

Hence

\[ v_1 = \alpha_1 u_1. \]

With respect to the new basis, let

\[ H' = \{ m_1 u_1 + m_2 w_2 + \ldots + m_n w_n \in H \mid m_1 = 0 \}. \]

Clearly $H' \cap V_1 = \{0\}$, where $V_1$ is the subgroup generated by $v_1$. We claim
that $H = H' + V_1$. For if $h \in H$ then

\[ h = \gamma_1 u_1 + \gamma_2 w_2 + \ldots + \gamma_n w_n \]

and putting

\[ \gamma_1 = \alpha_1 q + r_1 \qquad (0 \leq r_1 < \alpha_1) \]

it follows that $H$ contains

\[ h - q v_1 = r_1 u_1 + \gamma_2 w_2 + \ldots + \gamma_n w_n \]

and the minimality of $\alpha_1$ once more implies that $r_1 = 0$. Hence $h - q v_1 \in H'$.
It follows that $H$ is isomorphic to $H' \times V_1$ and $H'$ is a subgroup of the
group $G'$ which is free abelian of rank $n - 1$ with generators $w_2, \ldots, w_n$.
By induction, $H'$ is free of rank $\leq n - 1$, and there exist bases $u_2, \ldots, u_n$
of $G'$ and $v_2, \ldots, v_s$ of $H'$ such that $v_i = \alpha_i u_i$ for positive integers $\alpha_i$. The
result follows. $\Box$


From the above two results we can deduce a useful theorem about orders
of quotient groups. In its statement we use $|X|$ to denote the cardinality
of the set $X$, and $|x|$ to denote the absolute value of the real number $x$. No
confusion need arise.

Theorem 1.17. Let $G$ be a free abelian group of rank $r$, and $H$ a subgroup
of $G$. Then $G/H$ is finite if and only if the ranks of $G$ and $H$ are equal.
If this is the case, and if $G$ and $H$ have $\mathbb{Z}$-bases $x_1, \ldots, x_r$ and $y_1, \ldots, y_r$
with $y_i = \sum_j a_{ij} x_j$, then
$$|G/H| = |\det(a_{ij})|.$$


Proof: Let $H$ have rank $s$, and use Theorem 1.16 to choose $\mathbb{Z}$-bases
$u_1, \ldots, u_r$ of $G$ and $v_1, \ldots, v_s$ of $H$ with $v_i = \alpha_i u_i$ for $1 \le i \le s$. Clearly
$G/H$ is the direct product of finite cyclic groups of orders $\alpha_1, \ldots, \alpha_s$ and
$r - s$ infinite cyclic groups. Hence $|G/H|$ is finite if and only if $r = s$, and
in that case
$$|G/H| = \alpha_1 \cdots \alpha_r.$$

Now we have
$$u_i = \sum_j b_{ij} x_j, \qquad v_i = \sum_j c_{ij} u_j, \qquad y_i = \sum_j d_{ij} v_j,$$
where the matrices $(b_{ij}) = B$ and $(d_{ij}) = D$ are unimodular by Lemma
1.15, and
$$C = (c_{ij}) = \begin{pmatrix} \alpha_1 & & 0 \\ & \ddots & \\ 0 & & \alpha_r \end{pmatrix}.$$
Substituting each expression into the next shows that if $A = (a_{ij})$ we have $A = DCB$, and hence
$$\det(A) = \det(D)\det(C)\det(B).$$
So
$$|\det(A)| = |\pm 1|\,|\det(C)|\,|\pm 1| = |\alpha_1 \cdots \alpha_r| = |G/H|$$
as claimed. □


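For example, if $G$ has rank 3 and $\mathbb{Z}$-basis $x, y, z$; and if $H$ has $\mathbb{Z}$-basis
$$3x + y - 2z, \qquad 4x - 5y + z, \qquad x + 7z,$$
then $|G/H|$ is the absolute value of
$$\begin{vmatrix} 3 & 1 & -2 \\ 4 & -5 & 1 \\ 1 & 0 & 7 \end{vmatrix},$$
namely 142.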
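The determinant in this example is easy to confirm by machine. A minimal sketch, assuming a naive cofactor expansion (the function names `minor` and `det` are ours, not the text's):

```python
def minor(m, i, j):
    """Return the matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Integer determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

# Rows are the coefficients of the basis of H in terms of x, y, z.
M = [[3, 1, -2],
     [4, -5, 1],
     [1, 0, 7]]

index = abs(det(M))   # |G/H| by Theorem 1.17
print(index)
```

The determinant itself is negative; only its absolute value is the order of the quotient.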

Suppose now that $G$ is a finitely generated abelian group, and let its generators
be $w_1, \ldots, w_n$, where the latter need not be independent. Then we can
define a map $f : \mathbb{Z}^n \to G$ by:
$$f(m_1, \ldots, m_n) = m_1 w_1 + \ldots + m_n w_n.$$

This is surjective, so $G$ is isomorphic to $\mathbb{Z}^n/H$ where $H$ is the kernel of $f$.
We can use Theorem 1.16 to choose a new basis $u_1, \ldots, u_n$ of $\mathbb{Z}^n$ in such
a way that $\alpha_1 u_1, \ldots, \alpha_s u_s$ is a basis for $H$. Let $A$ be the subgroup of $\mathbb{Z}^n$
generated by $u_1, \ldots, u_s$ and $B$ be the subgroup generated by $u_{s+1}, \ldots, u_n$;
then clearly $G$ is isomorphic to $(A/H) \times B$, and so is the direct product of
a finite abelian group $A/H$ and a free group $B$ on $n - s$ generators. Putting
$n - s = k$, we have:

Proposition 1.18. Every finitely generated abelian group with $n$ generators
is the direct product of a finite abelian group and a free group on $k$ generators
where $k \le n$.


If $K$ is any subgroup of a finitely generated abelian group $G$, then
writing $G = F \times B$ where $F$ is finite and $B$ is finitely generated and free,
we find $K \cong (F \cap K) \times H$ where $H \subseteq B$. Then $F \cap K$ is finite and (by
Theorem 1.16) $H$ is finitely generated and free, so we find $K$ is finitely
generated. Hence we have:

Proposition 1.19. A subgroup of a finitely generated abelian group is finitely
generated. □


Of course the results in this section are not the best possible that can 
be proved in finitely generated abelian group theory. Refinements may 
be found in Fraleigh [35] chapter 9 pp. 86-93. The results that we have 
established are ample for our needs in this text, and we will delay no longer 
in making a start on the substance of algebraic number theory. 


1.7 Exercises 


1. Show that Theorem 1.3 becomes false if the word ‘finite’ is omitted 
from the hypotheses. 


2. Which of the following polynomials over $\mathbb{Z}$ are irreducible?
(a) $x^2 + 3$
(b) $x^2 - 169$
(c) $x^3 + x^2 + x + 1$
(d) $x^3 + 2x^2 + 3x + 4$.

3. Write down some polynomials over $\mathbb{Z}$ and factorize them into
irreducibles.

4. Does Theorem 1.5 remain true over a field of characteristic $p > 0$?

5. Find the minimum polynomial over $\mathbb{Q}$ of
(i) $(1 + i)/\sqrt{2}$
(ii) $i + \sqrt{2}$
(iii) $e^{2\pi i/3} + 2$.


6. Find the degrees of the following field extensions:
(a) $\mathbb{Q}(\sqrt{7}) : \mathbb{Q}$
(b) $\mathbb{C}(\sqrt{7}) : \mathbb{C}$
(c) $\mathbb{Q}(\sqrt{5}, \sqrt{7}, \sqrt{35}) : \mathbb{Q}$
(d) $\mathbb{R}(\theta) : \mathbb{R}$ where $\theta^3 - 7\theta + 6 = 0$ and $\theta \notin \mathbb{R}$
(e) $\mathbb{Q}(\pi) : \mathbb{Q}$.

7. Let $K$ be the field generated by the elements $e^{2\pi i/n}$ ($n = 1, 2, \ldots$).
Show that $K$ is an algebraic extension of $\mathbb{Q}$, but that $[K : \mathbb{Q}]$ is not
finite. (It may help to show that the minimum polynomial of $e^{2\pi i/p}$
for $p$ prime is $t^{p-1} + t^{p-2} + \ldots + 1$.)

8. Express the following polynomials in terms of elementary symmetric
polynomials, where this is possible.
(a) $t_1^2 + t_2^2 + t_3^2$ ($n = 3$)
(b) $t_1^3 + t_2^3$ ($n = 2$)
(c) $t_1 t_2^2 + t_2 t_3^2 + t_3 t_1^2$ ($n = 3$)
(d) $t_1 + t_2^2 + t_3^3$ ($n = 3$).

9. A polynomial belonging to $\mathbb{Z}[t_1, \ldots, t_n]$ is said to be antisymmetric if
it is invariant under even permutations of the variables, but changes
sign under odd permutations. Let
$$\Delta = \prod_{i<j} (t_i - t_j).$$
Show that $\Delta$ is antisymmetric. If $f$ is any antisymmetric polynomial,
prove that $f$ is expressible as a polynomial in the elementary
symmetric polynomials, together with $\Delta$. (Hint: consider $f/\Delta$.)

10. Find the orders of the groups $G/H$ where $G$ is free abelian with
$\mathbb{Z}$-basis $x, y, z$ and $H$ is generated by:
(a) $2x$, $3y$, $7z$
(b) $x + 3y - 5z$, $2x - 4y$, $7x + 2y - 9z$
(c) $x$
(d) $41x + 32y - 999z$, $16y + 3z$, $2y + 111z$
(e) $41x + 32y - 999z$.

11. Let $K$ be a field. Show that $M$ is a $K$-module if and only if it is a
vector space over $K$. Show that the submodules of $M$ are precisely
the vector subspaces. Do these statements remain true if we do not
use convention (d) (page 25) for modules?

12. Let $\mathbb{Z}$ be a $\mathbb{Z}$-module with the obvious action. Find all the
submodules.

13. Let $R$ be a ring, and let $M$ be a finitely generated $R$-module. Is it true
that $M$ necessarily has only finitely many distinct $R$-submodules? If
not, is there an extra condition on $R$ which will lead to this conclusion?

14. An abelian group $G$ is said to be torsion-free if $g \in G$, $g \neq 0$ and
$kg = 0$ for $k \in \mathbb{Z}$ implies $k = 0$. Prove that a finitely generated
torsion-free abelian group is a finitely generated free group.

15. By examining the proof of Theorem 1.16 carefully, or by other means,
prove that if $H$ is a subgroup of a free group $G$ of rank $n$ then there
exists a basis $u_1, \ldots, u_n$ for $G$ and a basis $v_1, \ldots, v_s$ for $H$ where
$s \le n$ and $v_i = \alpha_i u_i$ ($1 \le i \le s$) where the $\alpha_i$ are positive integers
and $\alpha_i$ divides $\alpha_{i+1}$ ($1 \le i \le s - 1$).


2 


Algebraic Numbers 


In this chapter we introduce the algebraic numbers as solutions of polyno- 
mial equations with integer coefficients. Among these numbers, the major 
players are the solutions of equations with integer coefficients whose lead- 
ing coefficient is 1. These are the algebraic integers. We shall develop a 
theory of factorization of algebraic integers, analogous to factorization of 
whole numbers. In many ways the theories are alike, but in at least one es- 
sential way—uniqueness of factorization—there are important differences. 
Factorization into irreducible elements depends on the ring in which the 
factorization is performed. In $\mathbb{Z}$ the number 5 is irreducible. The only
ways to write it as a product are trivial: multiply $\pm 5$ and $\pm 1$. However, in
$\mathbb{Z}[\sqrt{5}]$ it can be written as the non-trivial product $5 = \sqrt{5} \cdot \sqrt{5}$; moreover,
it turns out that $\sqrt{5}$ cannot be further factorized in this ring. Thus 5 is
irreducible in $\mathbb{Z}$, yet reducible in $\mathbb{Z}[\sqrt{5}]$.

To clarify these issues it is therefore essential to specify in which ring 
the factorization is to be carried out. The natural context is a ring of 
algebraic integers, contained in its associated algebraic number field. We 
begin with algebraic number fields that obey a finiteness condition: they 
are finite-dimensional as vector spaces over the rationals. It will follow that 
such a field is of the form $\mathbb{Q}[\theta]$ for a single algebraic number $\theta$.

We introduce the conjugates of an algebraic number and the discriminant
of a basis for $\mathbb{Q}[\theta]$ over $\mathbb{Q}$, using the conjugates of $\theta$ to show that
the discriminant is always a non-zero rational number. Algebraic integers
are defined and shown to form a ring. The ring of algebraic integers in a
number field is shown to have an integral basis whose discriminant is an
integer. This integer is independent of the choice of integral basis and is
called the discriminant of the number field.

Finally, we introduce the norm and trace of an algebraic number, which
prove to be ordinary integers when the algebraic number is an algebraic 
integer. Using the norm and trace in later chapters we shall be able to 
translate statements about algebraic integers into statements about ordi- 
nary integers which are easier to handle. 


2.1 Algebraic Numbers 


A complex number $\alpha$ will be called algebraic if it is algebraic over $\mathbb{Q}$,
that is, it satisfies a non-zero polynomial equation with coefficients in $\mathbb{Q}$.
Equivalently (clearing out denominators) we may assume the coefficients
to be in $\mathbb{Z}$. We let $\mathbb{A}$ denote the set of algebraic numbers. In fact $\mathbb{A}$ is a
field, by virtue of:


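Theorem 2.1. The set $\mathbb{A}$ of algebraic numbers is a subfield of the complex
field $\mathbb{C}$.

Proof: We use Theorem 1.11, which in this case says that $\alpha$ is algebraic
if and only if $[\mathbb{Q}(\alpha) : \mathbb{Q}]$ is finite. Suppose that $\alpha$, $\beta$ are algebraic. Then
$$[\mathbb{Q}(\alpha, \beta) : \mathbb{Q}] = [\mathbb{Q}(\alpha, \beta) : \mathbb{Q}(\alpha)]\,[\mathbb{Q}(\alpha) : \mathbb{Q}].$$

Now since $\beta$ is algebraic over $\mathbb{Q}$ it is certainly algebraic over $\mathbb{Q}(\alpha)$, so the
first factor on the right is finite; and the second factor is also finite. Hence
$[\mathbb{Q}(\alpha, \beta) : \mathbb{Q}]$ is finite. But each of $\alpha + \beta$, $\alpha - \beta$, $\alpha\beta$, and (for $\beta \neq 0$) $\alpha/\beta$
belongs to $\mathbb{Q}(\alpha, \beta)$. So all of these are in $\mathbb{A}$, and the theorem is proved. □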


The whole field $\mathbb{A}$ is not as interesting, for us, as certain of its subfields.
We define a number field to be a subfield $K$ of $\mathbb{C}$ such that $[K : \mathbb{Q}]$ is finite.
This implies that every element of $K$ is algebraic, and hence $K \subseteq \mathbb{A}$.
The trouble with $\mathbb{A}$ is that $[\mathbb{A} : \mathbb{Q}]$ is not finite (see Chapter 1, Exercise
7, or Stewart [71], Exercise 4.8, p. 55). If $K$ is a number field then $K =
\mathbb{Q}(\alpha_1, \ldots, \alpha_n)$ for finitely many algebraic numbers $\alpha_1, \ldots, \alpha_n$ (for instance,
a basis for $K$ as vector space over $\mathbb{Q}$). We can strengthen this observation
considerably:


Theorem 2.2. If $K$ is a number field then $K = \mathbb{Q}(\theta)$ for some algebraic
number $\theta$.

Proof: Arguing by induction, it is sufficient to prove that if $K = K_1(\alpha, \beta)$
then $K = K_1(\theta)$ for some $\theta$ (where $K_1$ is a subfield of $K$). Let $p$ and
$q$ respectively be the minimum polynomials of $\alpha$, $\beta$ over $K_1$, and suppose
that over $\mathbb{C}$ these factorize as
$$p(t) = (t - \alpha_1) \cdots (t - \alpha_n),$$
$$q(t) = (t - \beta_1) \cdots (t - \beta_m),$$
where we choose the numbering so that $\alpha_1 = \alpha$, $\beta_1 = \beta$. By Corollary 1.6
the $\alpha_i$ are distinct, as are the $\beta_j$. Hence for each $i$ and each $k \neq 1$ there is
at most one element $x \in K_1$ such that
$$\alpha_i + x\beta_k = \alpha_1 + x\beta_1.$$

Since there are only finitely many such equations, we may choose $c \neq 0$ in
$K_1$, not equal to any of these $x$'s, and then
$$\alpha_i + c\beta_k \neq \alpha + c\beta$$
for $1 \le i \le n$, $2 \le k \le m$. Define
$$\theta = \alpha + c\beta.$$

We shall prove that $K_1(\theta) = K_1(\alpha, \beta)$. Obviously $K_1(\theta) \subseteq K_1(\alpha, \beta)$, and
it suffices to prove that $\beta \in K_1(\theta)$ since $\alpha = \theta - c\beta$.
Now
$$p(\theta - c\beta) = p(\alpha) = 0.$$
We define the polynomial
$$r(t) = p(\theta - ct) \in K_1(\theta)[t]$$
and then $\beta$ is a zero of both $q(t)$ and $r(t)$ as polynomials over $K_1(\theta)$. Now
these polynomials have only one common zero, for if $q(\xi) = r(\xi) = 0$ then
$\xi$ is one of $\beta_1, \ldots, \beta_m$ and also $\theta - c\xi$ is one of $\alpha_1, \ldots, \alpha_n$. Our choice of $c$
forces $\xi = \beta$. Let $h(t)$ be the minimum polynomial of $\beta$ over $K_1(\theta)$. Then
$h(t) \mid q(t)$ and $h(t) \mid r(t)$. Since $q$ and $r$ have just one common zero in $\mathbb{C}$
we must have $\partial h = 1$, so that
$$h(t) = t + \mu$$
for $\mu \in K_1(\theta)$. Now $0 = h(\beta) = \beta + \mu$ so that $\beta = -\mu \in K_1(\theta)$ as required.
□

Example 2.3. $\mathbb{Q}(\sqrt{2}, \sqrt[3]{5})$.

We have
$$\alpha_1 = \sqrt{2}, \quad \alpha_2 = -\sqrt{2},$$
$$\beta_1 = \sqrt[3]{5}, \quad \beta_2 = \omega\sqrt[3]{5}, \quad \beta_3 = \omega^2\sqrt[3]{5}$$
where
$$\omega = \tfrac{1}{2}(-1 + \sqrt{-3})$$
is a complex cube root of 1. The number $c = 1$ satisfies
$$\alpha_i + c\beta_k \neq \alpha + c\beta$$
for $i = 1, 2$, $k = 2, 3$; since the number on the left is not real in any of the
four cases, whereas that on the right is. Hence $\mathbb{Q}(\sqrt{2}, \sqrt[3]{5}) = \mathbb{Q}(\sqrt{2} + \sqrt[3]{5})$.
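The condition on $c$ used in Example 2.3 can be checked numerically. The sketch below is only an illustration in floating-point arithmetic (so it suggests, rather than proves, the inequalities); the function name `is_valid_c` and the tolerance are our own choices.

```python
import cmath
import math

# Zeros of t^2 - 2 and of t^3 - 5, numbered so that alpha[0], beta[0]
# are the real numbers sqrt(2) and the real cube root of 5.
alpha = [math.sqrt(2), -math.sqrt(2)]
omega = cmath.exp(2j * cmath.pi / 3)          # complex cube root of 1
beta = [5 ** (1 / 3) * omega ** k for k in range(3)]

def is_valid_c(c, tol=1e-9):
    """True if alpha_i + c*beta_k differs from alpha + c*beta for all i and k >= 2."""
    target = alpha[0] + c * beta[0]
    return all(abs(a + c * b - target) > tol for a in alpha for b in beta[1:])

print(is_valid_c(1))   # the choice made in Example 2.3
print(is_valid_c(0))   # c = 0 is excluded in the proof of Theorem 2.2
```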


The expression of $K$ as $\mathbb{Q}(\theta)$ is, of course, not unique; for $\mathbb{Q}(\theta) =
\mathbb{Q}(-\theta) = \mathbb{Q}(\theta + 1) = \ldots$ etc.


2.2 Conjugates and Discriminants 


If $K = \mathbb{Q}(\theta)$ is a number field there will, in general, be several distinct
monomorphisms $\sigma : K \to \mathbb{C}$. For instance, if $K = \mathbb{Q}(i)$ where $i = \sqrt{-1}$
then we have the possibilities
$$\sigma_1(x + iy) = x + iy,$$
$$\sigma_2(x + iy) = x - iy,$$
for $x, y \in \mathbb{Q}$. The full set of such monomorphisms will play a fundamental
part in the theory, so we begin with a description.


Theorem 2.4. Let $K = \mathbb{Q}(\theta)$ be a number field of degree $n$ over $\mathbb{Q}$. Then
there are exactly $n$ distinct monomorphisms $\sigma_i : K \to \mathbb{C}$ ($i = 1, \ldots, n$).
The elements $\sigma_i(\theta) = \theta_i$ are the distinct zeros in $\mathbb{C}$ of the minimum
polynomial of $\theta$ over $\mathbb{Q}$.

Proof: Let $\theta_1, \ldots, \theta_n$ be the (by Corollary 1.6 distinct) zeros of the
minimum polynomial $p$ of $\theta$. Then each $\theta_i$ also has minimum polynomial $p$ (it
must divide $p$, and $p$ is irreducible) and so there is a unique field
isomorphism $\sigma_i : \mathbb{Q}(\theta) \to \mathbb{Q}(\theta_i)$ such that $\sigma_i(\theta) = \theta_i$. In fact, if $\alpha \in \mathbb{Q}(\theta)$ then
$\alpha = r(\theta)$ for a unique $r \in \mathbb{Q}[t]$ with $\partial r < n$; and we must have
$$\sigma_i(\alpha) = r(\theta_i).$$

(See Garling [28] Corollary 2 to Theorem 7.4, p. 66; Stewart [71], Theorem
3.8, p. 43.) Conversely if $\sigma : K \to \mathbb{C}$ is a monomorphism then $\sigma$ is the
identity on $\mathbb{Q}$. Then we have
$$0 = \sigma(p(\theta)) = p(\sigma(\theta))$$
so that $\sigma(\theta)$ is one of the $\theta_i$, hence $\sigma$ is one of the $\sigma_i$. □


Keep this notation, and for each $\alpha \in K = \mathbb{Q}(\theta)$ define the field
polynomial of $\alpha$ over $K$ to be
$$f_\alpha(t) = \prod_{i=1}^{n} (t - \sigma_i(\alpha)).$$

As it stands, this is in $K[t]$. In fact more is true:

Theorem 2.5. The coefficients of the field polynomial are rational numbers,
so that $f_\alpha(t) \in \mathbb{Q}[t]$.

Proof: We have $\alpha = r(\theta)$ for $r \in \mathbb{Q}[t]$, $\partial r < n$. Now the field polynomial
takes the form
$$f_\alpha(t) = \prod_i (t - r(\theta_i))$$
where the $\theta_i$ run through all zeros of the minimum polynomial $p$ of $\theta$, whose
coefficients are in $\mathbb{Q}$. It is easy to see that the coefficients of $f_\alpha(t)$ are of
the form
$$h(\theta_1, \ldots, \theta_n)$$
where $h(t_1, \ldots, t_n)$ is a symmetric polynomial in $\mathbb{Q}[t_1, \ldots, t_n]$. By
Corollary 1.14 the result follows. □

The elements $\sigma_i(\alpha)$, for $i = 1, \ldots, n$, are called the $K$-conjugates of $\alpha$.
Although the $\theta_i$ are distinct (and are the $K$-conjugates of $\theta$) it is not always
the case that the $K$-conjugates of $\alpha$ are distinct: for instance $\sigma_i(1) = 1$ for
all $i$. The precise situation is given by:

Theorem 2.6. With the above notation,
(a) The field polynomial $f_\alpha$ is a power of the minimum polynomial $p_\alpha$,
(b) The $K$-conjugates of $\alpha$ are the zeros of $p_\alpha$ in $\mathbb{C}$, each repeated $n/m$
times where $m = \partial p_\alpha$ is a divisor of $n$,
(c) The element $\alpha \in \mathbb{Q}$ if and only if all of its $K$-conjugates are equal,
(d) $\mathbb{Q}(\alpha) = \mathbb{Q}(\theta)$ if and only if all $K$-conjugates of $\alpha$ are distinct.

Proof: The main point is (a). Now $q = p_\alpha$ is irreducible, and $\alpha$ is a
zero of $f = f_\alpha$, so that $f = q^s h$ where $q$ and $h$ are coprime and both are
monic. (This follows from factorizing $f$ into irreducibles.) We claim that $h$
is constant. If not, some $\alpha_i = \sigma_i(\alpha) = r(\theta_i)$ is a zero of $h$, where $\alpha = r(\theta)$.
Hence if $g(t) = h(r(t))$ then $g(\theta_i) = 0$. Let $p$ be the minimum polynomial
of $\theta$ over $\mathbb{Q}$, and hence also of each $\theta_i$. Then $p \mid g$, so that $g(\theta_j) = 0$ for all
$j$, and in particular $g(\theta) = 0$. Therefore $h(\alpha) = h(r(\theta)) = g(\theta) = 0$ and so
$q$ divides $h$, a contradiction. Hence $h$ is constant and monic, so $h = 1$ and
$f = q^s$.

(b) is an immediate consequence of (a) on referring to the definition of
the field polynomial.

To prove (c), it is clear that $\alpha \in \mathbb{Q}$ implies $\sigma_i(\alpha) = \alpha$ for all $i$, since each
$\sigma_i$ is the identity on $\mathbb{Q}$. Conversely if all $\sigma_i(\alpha)$ are equal then, since the
zeros of $q = p_\alpha$ are distinct and $f_\alpha = q^s$, we have $\partial q = 1$ and so $\alpha \in \mathbb{Q}$.

Finally for (d): if all $\sigma_i(\alpha)$ are distinct then $\partial p_\alpha = n$, and hence
$[\mathbb{Q}(\alpha) : \mathbb{Q}] = n = [\mathbb{Q}(\theta) : \mathbb{Q}]$. This implies that $\mathbb{Q}(\alpha) = \mathbb{Q}(\theta)$. Conversely if
$\mathbb{Q}(\alpha) = \mathbb{Q}(\theta)$ then $\partial p_\alpha = n$ and so the $\sigma_i(\alpha)$ are distinct. □

Warning. Note that the $K$-conjugates of $\alpha$ need not be elements of $K$.
Even the $\theta_i$ need not be elements of $K$. For example, let $\theta$ be the real cube
root of 2. Then $\mathbb{Q}(\theta)$ is a subfield of $\mathbb{R}$. The $K$-conjugates of $\theta$, however,
are $\theta$, $\omega\theta$, $\omega^2\theta$, where $\omega = \tfrac{1}{2}(-1 + \sqrt{-3})$. The last two of these are nonreal,
hence do not lie in $\mathbb{Q}(\theta)$.


Still with $K = \mathbb{Q}(\theta)$ of degree $n$, let $\{\alpha_1, \ldots, \alpha_n\}$ be a basis of $K$ (as
vector space over $\mathbb{Q}$). We define the discriminant of this basis to be
$$\Delta[\alpha_1, \ldots, \alpha_n] = \{\det[\sigma_i(\alpha_j)]\}^2. \qquad (2.1)$$

If we pick another basis $\{\beta_1, \ldots, \beta_n\}$ then
$$\beta_k = \sum_{i=1}^{n} c_{ik}\alpha_i \qquad (c_{ik} \in \mathbb{Q})$$
for $k = 1, \ldots, n$, and
$$\det(c_{ik}) \neq 0.$$
The product formula for determinants, and the fact that the $\sigma_i$ are
monomorphisms (and hence the identity on $\mathbb{Q}$) shows that
$$\Delta[\beta_1, \ldots, \beta_n] = [\det(c_{ik})]^2\,\Delta[\alpha_1, \ldots, \alpha_n].$$

Theorem 2.7. The discriminant of any basis for $K = \mathbb{Q}(\theta)$ is rational and
non-zero. If all $K$-conjugates of $\theta$ are real then the discriminant of any
basis is positive.

Proof: First we pick a basis with which we can compute: the obvious one
is $\{1, \theta, \ldots, \theta^{n-1}\}$. If the conjugates of $\theta$ are $\theta_1, \ldots, \theta_n$ then
$$\Delta[1, \theta, \ldots, \theta^{n-1}] = (\det \theta_i^{\,j})^2.$$

A determinant of the form $D = \det(t_i^{\,j})$ is called a Vandermonde
determinant, and has value
$$D = \prod_{1 \le i < j \le n} (t_i - t_j). \qquad (2.2)$$
To see this, think of everything as lying inside $\mathbb{Q}[t_1, \ldots, t_n]$. Then for
$t_i = t_j$ the determinant has two equal rows, so vanishes. Hence $D$ is
divisible by each $(t_i - t_j)$. To avoid repeating such a factor twice we take
$i < j$. Then comparison of degrees easily shows that $D$ has no other
non-constant factors; comparing coefficients of $t_1 t_2^2 \cdots t_n^n$ gives 2.2.
Hence
$$\Delta = \Delta[1, \theta, \ldots, \theta^{n-1}] = \prod_{1 \le i < j \le n} (\theta_i - \theta_j)^2.$$
Now $D$ is antisymmetric in the $t_i$, so that $D^2$ is symmetric. Hence by the
usual argument on symmetric polynomials (Corollary 1.14), $\Delta$ is rational.
Since the $\theta_i$ are distinct, $\Delta \neq 0$.

Now let $\{\beta_1, \ldots, \beta_n\}$ be any basis. Then
$$\Delta[\beta_1, \ldots, \beta_n] = (\det c_{ik})^2\,\Delta$$
for certain rational numbers $c_{ik}$, and $\det(c_{ik}) \neq 0$ so that
$$\Delta[\beta_1, \ldots, \beta_n] \neq 0,$$
and is rational. Clearly if all $\theta_i$ are real then $\Delta$ is a positive real number,
hence so is $\Delta[\beta_1, \ldots, \beta_n]$. □
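As an illustration of the product formula for $\Delta[1, \theta, \ldots, \theta^{n-1}]$, the sketch below evaluates it numerically for $\theta$ the real cube root of 2, whose conjugates are $\theta$, $\omega\theta$, $\omega^2\theta$. This is a floating-point computation under our own naming (`power_basis_discriminant`), so the result is only approximately the rational number that Theorem 2.7 guarantees.

```python
import cmath

def power_basis_discriminant(conjugates):
    """Product of (theta_i - theta_j)^2 over all pairs i < j."""
    prod = 1
    n = len(conjugates)
    for i in range(n):
        for j in range(i + 1, n):
            prod *= (conjugates[i] - conjugates[j]) ** 2
    return prod

theta = 2 ** (1 / 3)                       # real cube root of 2
omega = cmath.exp(2j * cmath.pi / 3)       # complex cube root of 1
conjugates = [theta, omega * theta, omega ** 2 * theta]

disc = power_basis_discriminant(conjugates)
print(disc)   # close to -108, up to rounding error
```

The value is negative, which is consistent with Theorem 2.7: positivity is only asserted when all conjugates are real, and here two of them are not.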


With the above notation, $\Delta$ vanishes if and only if some $\theta_i$ is equal to
another $\theta_j$. Hence the non-vanishing of $\Delta$ allows us to 'discriminate' the
$\theta_i$, which motivates calling $\Delta$ the discriminant.

2.3 Algebraic Integers

A complex number $\theta$ is an algebraic integer if there is a monic polynomial
$p(t)$ with integer coefficients such that $p(\theta) = 0$. In other words,
$$\theta^n + a_{n-1}\theta^{n-1} + \ldots + a_0 = 0$$
where $a_i \in \mathbb{Z}$ for all $i$.

For example, $\theta = \sqrt{-2}$ is an algebraic integer, since $\theta^2 + 2 = 0$; $\tau =
\tfrac{1}{2}(1 + \sqrt{5})$ is an algebraic integer, since $\tau^2 - \tau - 1 = 0$. But $\phi = 22/7$
is not. It satisfies equations like $7\phi - 22 = 0$, but this is not monic; or
like $\phi - 22/7 = 0$, whose coefficients are not integers; but it can be shown
without difficulty that $\phi$ does not satisfy any monic polynomial equation
with integer coefficients.

We write $\mathbb{B}$ for the set of algebraic integers. One of our aims is to prove
that $\mathbb{B}$ is a subring of $\mathbb{A}$. We prepare for this by proving:

Lemma 2.8. A complex number $\theta$ is an algebraic integer if and only if the
additive group generated by all powers $1, \theta, \theta^2, \ldots$ is finitely generated.


Proof: If $\theta$ is an algebraic integer, then for some $n$ we have
$$\theta^n + a_{n-1}\theta^{n-1} + \ldots + a_0 = 0 \qquad (2.3)$$
where the $a_i \in \mathbb{Z}$. We claim that every power of $\theta$ lies in the additive
group generated by $1, \theta, \ldots, \theta^{n-1}$. Call this group $\Gamma$. Then (2.3) shows
that $\theta^n \in \Gamma$. Inductively, if $m \ge n$ and $\theta^m \in \Gamma$ then
$$\theta^{m+1} = \theta^{m+1-n}\theta^n = \theta^{m+1-n}(-a_{n-1}\theta^{n-1} - \ldots - a_0) \in \Gamma.$$

This proves that every power of $\theta$ lies in $\Gamma$, which gives one implication.

For the converse, suppose that every power of $\theta$ lies in a finitely
generated additive group $G$. The subgroup $\Gamma$ of $G$ generated by the powers
$1, \theta, \theta^2, \ldots$ must also be finitely generated (Proposition 1.19), so we
will suppose that $\Gamma$ has generators $v_1, \ldots, v_n$. Each $v_i$ is a polynomial in
$\theta$ with integer coefficients, so $\theta v_i$ is also such a polynomial. Hence there
exist integers $b_{ij}$ such that
$$\theta v_i = \sum_{j=1}^{n} b_{ij} v_j.$$
This leads to a system of homogeneous equations for the $v_i$ of the form
$$\begin{aligned}
(b_{11} - \theta)v_1 + b_{12}v_2 + \ldots + b_{1n}v_n &= 0 \\
b_{21}v_1 + (b_{22} - \theta)v_2 + \ldots + b_{2n}v_n &= 0 \\
&\;\;\vdots \\
b_{n1}v_1 + b_{n2}v_2 + \ldots + (b_{nn} - \theta)v_n &= 0.
\end{aligned}$$
Since there exists a solution $v_1, \ldots, v_n \in \mathbb{C}$, not all zero, it follows that the
determinant
$$\begin{vmatrix}
b_{11} - \theta & b_{12} & \cdots & b_{1n} \\
b_{21} & b_{22} - \theta & \cdots & b_{2n} \\
\vdots & \vdots & & \vdots \\
b_{n1} & b_{n2} & \cdots & b_{nn} - \theta
\end{vmatrix}$$
is zero. Expanding this, we see that $\theta$ satisfies a monic polynomial equation
with integer coefficients. □


Theorem 2.9. The algebraic integers form a subring of the field of algebraic
numbers.

Proof: Let $\theta, \phi \in \mathbb{B}$. We have to show that $\theta + \phi$ and $\theta\phi \in \mathbb{B}$. By Lemma
2.8 all powers of $\theta$ lie in a finitely generated additive subgroup $\Gamma_\theta$ of $\mathbb{C}$, and
all powers of $\phi$ lie in a finitely generated additive subgroup $\Gamma_\phi$. But now
all powers of $\theta + \phi$ and of $\theta\phi$ are integer linear combinations of elements
$\theta^i\phi^j$ which lie in $\Gamma_\theta\Gamma_\phi \subseteq \mathbb{C}$. But if $\Gamma_\theta$ has generators $v_1, \ldots, v_n$ and $\Gamma_\phi$
has generators $w_1, \ldots, w_m$, then $\Gamma_\theta\Gamma_\phi$ is the additive group generated by
all $v_i w_j$ for $1 \le i \le n$, $1 \le j \le m$. Hence all powers of $\theta + \phi$ and of $\theta\phi$ lie
in a finitely generated additive subgroup of $\mathbb{C}$, so by Lemma 2.8 $\theta + \phi$ and
$\theta\phi$ are algebraic integers. Hence $\mathbb{B}$ is a subring of $\mathbb{A}$. □


A simple extension of this technique allows us to prove the following
useful theorem.

Theorem 2.10. Let $\theta$ be a complex number satisfying a monic polynomial
equation whose coefficients are algebraic integers. Then $\theta$ is an algebraic
integer.

Proof: Suppose that
$$\theta^n + \psi_{n-1}\theta^{n-1} + \ldots + \psi_0 = 0$$
where $\psi_0, \ldots, \psi_{n-1} \in \mathbb{B}$. Then these generate a subring $\Psi$ of $\mathbb{B}$. The
argument of Lemma 2.8 shows that all powers of $\theta$ lie inside a finitely
generated $\Psi$-submodule $M$ of $\mathbb{C}$, spanned by $1, \theta, \ldots, \theta^{n-1}$. By Theorem
2.9, each $\psi_i$ and all its powers lie inside a finitely generated additive group
$\Gamma_i$ with generators $\gamma_{ij}$ ($1 \le j \le n_i$). It follows that $M$ lies inside the
additive group generated by all elements
$$\gamma_{0 j_0}\gamma_{1 j_1} \cdots \gamma_{n-1,\, j_{n-1}}\theta^k$$
($1 \le j_i \le n_i$, $0 \le i \le n - 1$, $0 \le k \le n - 1$), which is a finite set. So $M$ is
finitely generated as an additive group, and the theorem follows. □


Theorems 2.9 and 2.10 allow us to construct many new algebraic integers
out of known ones. For instance, $\sqrt{2}$ and $\sqrt{3}$ are clearly algebraic
integers. Then Theorem 2.9 says that numbers such as $\sqrt{2} + \sqrt{3}$, $7\sqrt{2} - 41\sqrt{3}$,
$(\sqrt{2})^5(1 + \sqrt{3})^2$ are also algebraic integers. And Theorem 2.10 says that
zeros of polynomials such as


$$t^{23} - (14 + \sqrt[9]{3})\,t^{9} + (\sqrt{2})\,t^{5} - 19\sqrt{3}$$


are algebraic integers. It would not be easy, particularly in the last instance, 
to compute explicit polynomials over Z of which these algebraic integers are 
zeros; although it can in principle be done by using symmetric polynomials. 
In fact Theorems 2.9 and 2.10 can be proved this way. 
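For the simplest of these examples the explicit polynomial is easy to find. The sketch below, which is our own illustration rather than the text's method in full, multiplies out $\prod (t - (\pm\sqrt{2} \pm\sqrt{3}))$ over all four sign choices in floating-point arithmetic and rounds the coefficients; the symmetric-polynomial argument is what guarantees that the exact coefficients are rational integers.

```python
import math
from itertools import product

def poly_from_roots(roots):
    """Coefficients (highest degree first) of the monic polynomial with the given roots."""
    coeffs = [1.0]
    for r in roots:
        new = coeffs + [0.0]            # multiply the current polynomial by t
        for i, c in enumerate(coeffs):  # then subtract r times the polynomial
            new[i + 1] -= r * c
        coeffs = new
    return coeffs

# All four numbers s*sqrt(2) + t*sqrt(3) with s, t in {+1, -1}.
roots = [s * math.sqrt(2) + t * math.sqrt(3) for s, t in product((1, -1), repeat=2)]
coeffs = [round(c) for c in poly_from_roots(roots)]
print(coeffs)   # coefficients of t^4 - 10 t^2 + 1
```

So $\sqrt{2} + \sqrt{3}$ is a zero of the monic integer polynomial $t^4 - 10t^2 + 1$, as Theorem 2.9 predicts it must be a zero of some such polynomial.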

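For any number field $K$ we write
$$\mathfrak{O} = K \cap \mathbb{B},$$
and call $\mathfrak{O}$ the ring of integers of $K$. The symbol '$\mathfrak{O}$' is a Gothic capital
O (for 'order', the old terminology). In cases where it is not immediately
clear which number field is involved, we write more explicitly $\mathfrak{O}_K$. Since
$K$ and $\mathbb{B}$ are subrings of $\mathbb{C}$ it follows that $\mathfrak{O}$ is a subring of $K$. Further
$\mathbb{Z} \subseteq \mathbb{Q} \subseteq K$ and $\mathbb{Z} \subseteq \mathbb{B}$ so $\mathbb{Z} \subseteq \mathfrak{O}$.

The following lemma is easy to prove:

Lemma 2.11. If $\alpha \in K$ then for some non-zero $c \in \mathbb{Z}$ we have $c\alpha \in \mathfrak{O}$.

Corollary 2.12. If $K$ is a number field then $K = \mathbb{Q}(\theta)$ for an algebraic
integer $\theta$.

Proof: We have $K = \mathbb{Q}(\phi)$ for an algebraic number $\phi$ by Theorem 2.2.
By Lemma 2.11, $\theta = c\phi$ is an algebraic integer for some $0 \neq c \in \mathbb{Z}$. Clearly
$\mathbb{Q}(\phi) = \mathbb{Q}(\theta)$. □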


Warning. For $\theta \in \mathbb{C}$ let us write $\mathbb{Z}[\theta]$ for the set of elements $p(\theta)$, for
polynomials $p \in \mathbb{Z}[t]$. If $K = \mathbb{Q}(\theta)$ where $\theta$ is an algebraic integer then
certainly $\mathfrak{O}$ contains $\mathbb{Z}[\theta]$ since $\mathfrak{O}$ is a ring containing $\theta$. However, $\mathfrak{O}$ need
not equal $\mathbb{Z}[\theta]$. For example, $\mathbb{Q}(\sqrt{5})$ is a number field and $\sqrt{5}$ an algebraic
integer. But
$$\frac{1 + \sqrt{5}}{2}$$
is a zero of $t^2 - t - 1$, hence an algebraic integer; and it lies in $\mathbb{Q}(\sqrt{5})$ so
belongs to $\mathfrak{O}$. It does not belong to $\mathbb{Z}[\sqrt{5}]$.

There is a useful criterion, in terms of the minimum polynomial, for a
number to be an algebraic integer:

Lemma 2.13. An algebraic number $\alpha$ is an algebraic integer if and only if
its minimum polynomial over $\mathbb{Q}$ has coefficients in $\mathbb{Z}$.

Proof: Let $p$ be the minimum polynomial of $\alpha$ over $\mathbb{Q}$, and recall that this
is monic and irreducible in $\mathbb{Q}[t]$. If $p \in \mathbb{Z}[t]$ then $\alpha$ is an algebraic integer.
Conversely, if $\alpha$ is an algebraic integer then $q(\alpha) = 0$ for some monic
$q \in \mathbb{Z}[t]$, and $p \mid q$. By Gauss's Lemma 1.7 it follows that $p \in \mathbb{Z}[t]$, because
some rational multiple $\lambda p$ lies in $\mathbb{Z}[t]$ and divides $q$, and the monicity of $q$
and $p$ implies $\lambda = 1$. □

To avoid confusion as to the usage of the word 'integer' we adopt the
following convention: a rational integer is an element of $\mathbb{Z}$, and a plain
integer is an algebraic integer. (The aim is to reserve the shorter term
for the concept most often encountered.) Any remaining possibility of
confusion is eliminated by:

Lemma 2.14. An algebraic integer is a rational number if and only if it is
a rational integer. Equivalently, $\mathbb{B} \cap \mathbb{Q} = \mathbb{Z}$.

Proof: Clearly $\mathbb{Z} \subseteq \mathbb{B} \cap \mathbb{Q}$. Let $\alpha \in \mathbb{B} \cap \mathbb{Q}$; since $\alpha \in \mathbb{Q}$ its minimum
polynomial over $\mathbb{Q}$ is $t - \alpha$. By Lemma 2.13 the coefficients of this are in
$\mathbb{Z}$, hence $-\alpha \in \mathbb{Z}$, hence $\alpha \in \mathbb{Z}$. □


2.4 Integral Bases 


Let $K$ be a number field of degree $n$ (over $\mathbb{Q}$). A basis (or $\mathbb{Q}$-basis for
emphasis) of $K$ is a basis for $K$ as a vector space over $\mathbb{Q}$. By Corollary
2.12 we have $K = \mathbb{Q}(\theta)$ where $\theta$ is an algebraic integer, and it follows that
the minimum polynomial $p$ of $\theta$ has degree $n$ and that $\{1, \theta, \ldots, \theta^{n-1}\}$ is
a basis for $K$.

The ring $\mathfrak{O}$ of integers of $K$ is an abelian group under addition. A
$\mathbb{Z}$-basis for $(\mathfrak{O}, +)$ is called an integral basis for $K$ (or for $\mathfrak{O}$). Thus
$\{\alpha_1, \ldots, \alpha_s\}$ is an integral basis if and only if all $\alpha_i \in \mathfrak{O}$ and every element
of $\mathfrak{O}$ is uniquely expressible in the form
$$a_1\alpha_1 + \ldots + a_s\alpha_s$$
for rational integers $a_1, \ldots, a_s$. It is obvious from Lemma 2.11 that any
integral basis for $K$ is a $\mathbb{Q}$-basis. Hence in particular $s = n$. But we have
to verify that integral bases exist. In fact they do, but they are not always
what naively we might expect them to be.

For instance we can assert that $K = \mathbb{Q}[\theta]$ ($= \mathbb{Q}(\theta)$) for an algebraic
integer $\theta$ (Corollary 2.12), so that $\{1, \theta, \ldots, \theta^{n-1}\}$ is a $\mathbb{Q}$-basis for $K$ which
consists of integers, but it does not follow that $\{1, \theta, \ldots, \theta^{n-1}\}$ is an integral
basis. Some of the elements in $\mathbb{Q}[\theta]$ with rational coefficients may also be
integers. As an example, consider $K = \mathbb{Q}(\sqrt{5})$. We have seen that the
element $\tfrac{1}{2} + \tfrac{1}{2}\sqrt{5}$ satisfies the equation
$$t^2 - t - 1 = 0$$
and so is an integer in $\mathbb{Q}(\sqrt{5})$, but it is not an element of $\mathbb{Z}[\sqrt{5}]$.

Our first problem, therefore, is to show that integral bases exist. That
they do is equivalent to the statement that $(\mathfrak{O}, +)$ is a free abelian group
of rank $n$. To prove this we first establish:

Lemma 2.15. If $\{\alpha_1, \ldots, \alpha_n\}$ is a basis of $K$ consisting of integers, then
the discriminant $\Delta[\alpha_1, \ldots, \alpha_n]$ is a rational integer, not equal to zero.

Proof: We know that $\Delta = \Delta[\alpha_1, \ldots, \alpha_n]$ is rational by Theorem 2.7, and
it is an integer since the $\alpha_i$ are. Hence by Lemma 2.14 it is a rational
integer. By Theorem 2.7, $\Delta \neq 0$. □

Theorem 2.16. Every number field $K$ possesses an integral basis, and the
additive group of $\mathfrak{O}$ is free abelian of rank $n$ equal to the degree of $K$.

Proof: We have $K = \mathbb{Q}(\theta)$ for $\theta$ an integer. Hence there exist bases for $K$
consisting of integers: for example $\{1, \theta, \ldots, \theta^{n-1}\}$. We have already seen
that such $\mathbb{Q}$-bases need not be integral bases. However, the discriminant of
a $\mathbb{Q}$-basis consisting of integers is always a rational integer (Lemma 2.15),
so what we do is to select a basis $\{\omega_1, \ldots, \omega_n\}$ of integers for which
$$|\Delta[\omega_1, \ldots, \omega_n]|$$
is least. We claim that this is in fact an integral basis. If not, there is an
integer $\omega$ of $K$ such that
$$\omega = a_1\omega_1 + \ldots + a_n\omega_n$$
for $a_i \in \mathbb{Q}$, not all in $\mathbb{Z}$. Choose the numbering so that $a_1 \notin \mathbb{Z}$. Then
$a_1 = a + r$ where $a \in \mathbb{Z}$ and $0 < r < 1$. Define
$$\psi_1 = \omega - a\omega_1, \qquad \psi_i = \omega_i \quad (i = 2, \ldots, n).$$

Then $\{\psi_1, \ldots, \psi_n\}$ is a basis consisting of integers. The determinant
relevant to the change of basis from the $\omega$'s to the $\psi$'s is
$$\begin{vmatrix}
a_1 - a & a_2 & a_3 & \cdots & a_n \\
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & & & & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{vmatrix} = r,$$
and so
$$\Delta[\psi_1, \ldots, \psi_n] = r^2\,\Delta[\omega_1, \ldots, \omega_n].$$

Since $0 < r < 1$ this contradicts the choice of $\{\omega_1, \ldots, \omega_n\}$ making
$|\Delta[\omega_1, \ldots, \omega_n]|$ minimal.

It follows that $\{\omega_1, \ldots, \omega_n\}$ is an integral basis, and so $(\mathfrak{O}, +)$ is free
abelian of rank $n$. □


This raises the question of finding integral bases in cases such as $\mathbb{Q}(\sqrt{5})$
where the $\mathbb{Q}$-basis $\{1, \sqrt{5}\}$ is not an integral basis. We shall consider a more
general case in the next chapter, but this particular example is worth a brief
discussion here.

An element of $\mathbb{Q}(\sqrt{5})$ is of the form $p + q\sqrt{5}$ for $p, q \in \mathbb{Q}$, and has
minimum polynomial
$$(t - p - q\sqrt{5})(t - p + q\sqrt{5}) = t^2 - 2pt + (p^2 - 5q^2).$$

Then $p + q\sqrt{5}$ is an integer if and only if the coefficients $2p$, $p^2 - 5q^2$ are
rational integers. Thus $p = \tfrac{1}{2}P$ where $P$ is a rational integer. For $P$ even,
we have $p^2$ a rational integer, so $5q^2$ is a rational integer also, implying $q$
is a rational integer. For $P$ odd, a straightforward calculation (performed
in the next chapter in greater generality) shows $q = \tfrac{1}{2}Q$ where $Q$ is also an
odd rational integer.
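The criterion just described is easy to test in exact arithmetic. A minimal sketch, assuming elements $p + q\sqrt{5}$ are represented by the pair $(p, q)$ of rationals; the function name is our own, and the test uses the coefficients $2p$ and $p^2 - 5q^2$ of the quadratic above (for $q = 0$ this quadratic is $(t - p)^2$, and the test still gives the right answer).

```python
from fractions import Fraction

def is_integer_in_Q_sqrt5(p, q):
    """True if p + q*sqrt(5) is an algebraic integer, for rational p and q."""
    p, q = Fraction(p), Fraction(q)
    trace_coeff = 2 * p              # coefficient 2p
    norm_coeff = p * p - 5 * q * q   # coefficient p^2 - 5q^2
    return trace_coeff.denominator == 1 and norm_coeff.denominator == 1

half = Fraction(1, 2)
print(is_integer_in_Q_sqrt5(half, half))            # (1 + sqrt 5)/2
print(is_integer_in_Q_sqrt5(half, 0))               # 1/2
print(is_integer_in_Q_sqrt5(half, Fraction(3, 2)))  # P and Q both odd
print(is_integer_in_Q_sqrt5(3, 2))                  # element of Z[sqrt 5]
```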

From this it follows that $\mathfrak{O} = \mathbb{Z}[\tfrac{1}{2} + \tfrac{1}{2}\sqrt{5}]$ and an integral basis is
$$\{1, \tfrac{1}{2} + \tfrac{1}{2}\sqrt{5}\}.$$

We can prove this by another route using the discriminant. The two
monomorphisms $\mathbb{Q}(\sqrt{5}) \to \mathbb{C}$ are given by
$$\sigma_1(p + q\sqrt{5}) = p + q\sqrt{5},$$
$$\sigma_2(p + q\sqrt{5}) = p - q\sqrt{5}.$$

Hence the discriminant $\Delta[1, \tfrac{1}{2} + \tfrac{1}{2}\sqrt{5}]$ is given by
$$\begin{vmatrix} 1 & \tfrac{1}{2} + \tfrac{1}{2}\sqrt{5} \\ 1 & \tfrac{1}{2} - \tfrac{1}{2}\sqrt{5} \end{vmatrix}^2 = 5.$$

We define a rational integer to be squarefree if it is not divisible by the
square of a prime. For example, 5 is squarefree, as are 6, 7, but not 8 or 9.
Given a $\mathbb{Q}$-basis of $K$ consisting of integers, we compute the discriminant
and then we have:

Theorem 2.17. Suppose $\alpha_1, \ldots, \alpha_n \in \mathfrak{O}$ form a $\mathbb{Q}$-basis for $K$. If
$\Delta[\alpha_1, \ldots, \alpha_n]$ is squarefree then $\{\alpha_1, \ldots, \alpha_n\}$ is an integral basis.

Proof: Let $\{\beta_1, \ldots, \beta_n\}$ be an integral basis. Then there exist rational
integers $c_{ij}$ such that $\alpha_i = \sum_j c_{ij}\beta_j$, and
$$\Delta[\alpha_1, \ldots, \alpha_n] = (\det c_{ij})^2\,\Delta[\beta_1, \ldots, \beta_n].$$

Since the left-hand side is squarefree, we must have $\det c_{ij} = \pm 1$, so that
$(c_{ij})$ is unimodular. Hence by Lemma 1.15 $\{\alpha_1, \ldots, \alpha_n\}$ is a $\mathbb{Z}$-basis for
$\mathfrak{O}$, that is, an integral basis for $K$. □

For example, the $\mathbb{Q}$-basis $\{1, \tfrac{1}{2} + \tfrac{1}{2}\sqrt{5}\}$ for $\mathbb{Q}(\sqrt{5})$ consists of integers
and has discriminant 5 (calculated above). Since 5 is squarefree, this is an
integral basis. The reader should note that there exist integral bases whose
discriminants are not squarefree (as we shall see later on), so the converse
of Theorem 2.17 is false.
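The two bases of $\mathbb{Q}(\sqrt{5})$ discussed here can be compared numerically. The sketch below is our own illustration: a basis element $p + q\sqrt{5}$ is represented by the pair `(p, q)`, and the discriminant is computed in floating point from definition (2.1), so the values are only approximately 5 and 20.

```python
import math

def discriminant_Q_sqrt5(basis):
    """Discriminant of a two-element basis of Q(sqrt 5), each element given as (p, q)."""
    s = math.sqrt(5)
    row1 = [p + q * s for p, q in basis]   # sigma_1 applied to the basis
    row2 = [p - q * s for p, q in basis]   # sigma_2 applied to the basis
    d = row1[0] * row2[1] - row1[1] * row2[0]
    return d * d

disc_integral = discriminant_Q_sqrt5([(1, 0), (0.5, 0.5)])  # {1, (1 + sqrt 5)/2}
disc_power = discriminant_Q_sqrt5([(1, 0), (0, 1)])         # {1, sqrt 5}
print(disc_integral, disc_power)
```

The second value is $20 = 2^2 \cdot 5$, which is not squarefree; the factor $2^2$ is the square of the determinant of the change of basis from $\{1, \tfrac{1}{2} + \tfrac{1}{2}\sqrt{5}\}$ to $\{1, \sqrt{5}\}$.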

For two integral bases {a,..., Qn}, {G1,--. , Gn} of an algebraic num- 
ber field K, we have 


Alay,... ;@n] = (£1)*AlA1,... ; Bn] = AlB1,--- Bn]; 


because the matrix corresponding to the change of basis is unimodular. 
Hence the discriminant of an integral basis is independent of which integral 
basis we choose. This common value is called the discriminant of K (or 
of D). It is always a non-zero rational integer. Obviously, isomorphic 
number fields have the same discriminant. The important role played by 
the discriminant will become apparent as the drama unfolds. 
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2.5 Norms and Traces 


These important concepts often allow us to transform a problem about
algebraic integers into one about rational integers. As usual, let $K = \mathbb{Q}(\theta)$
be a number field of degree $n$ and let $\sigma_1, \ldots, \sigma_n$ be the monomorphisms
$K \to \mathbb{C}$. Now the field polynomial is a power of the minimum polynomial
by Theorem 2.6(a), so by Lemma 2.13 and Gauss's Lemma 1.7 it follows
that $\alpha \in K$ is an integer if and only if the field polynomial has rational
integer coefficients. For any $\alpha \in K$ we define the norm

\[ N_K(\alpha) = \prod_{i=1}^{n} \sigma_i(\alpha) \]

and trace

\[ T_K(\alpha) = \sum_{i=1}^{n} \sigma_i(\alpha). \]

Where the field $K$ is clear from the context, we will abbreviate the norm
and trace of $\alpha$ to $N(\alpha)$ and $T(\alpha)$ respectively.
Since the field polynomial is

\[ f_\alpha(t) = \prod_{i=1}^{n} (t - \sigma_i(\alpha)) \]

it follows from the remark above that if $\alpha$ is an integer then the norm and
trace of $\alpha$ are rational integers. Since the $\sigma_i$ are monomorphisms it is clear
that

\[ N(\alpha\beta) = N(\alpha)N(\beta) \tag{2.4} \]

and if $\alpha \neq 0$ then $N(\alpha) \neq 0$. If $p, q$ are rational numbers then

\[ T(p\alpha + q\beta) = pT(\alpha) + qT(\beta). \tag{2.5} \]

For instance, if $K = \mathbb{Q}(\sqrt7)$ then the integers of $K$ are given by
$\mathfrak{O} = \mathbb{Z}[\sqrt7]$ (as we shall see in Theorem 3.2). The maps $\sigma_i$ are given by

\[ \sigma_1(p + q\sqrt7) = p + q\sqrt7, \qquad \sigma_2(p + q\sqrt7) = p - q\sqrt7. \]

Hence

\[ N(p + q\sqrt7) = p^2 - 7q^2, \qquad T(p + q\sqrt7) = 2p. \]
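The formulas for $\mathbb{Q}(\sqrt7)$ can be checked mechanically. The following is a minimal Python sketch, not part of the original text: elements $p + q\sqrt7$ are stored as pairs `(p, q)` of rationals, and the helper names `conjugates`, `multiply`, `norm` and `trace` are illustrative choices. It computes the norm and trace directly from the two monomorphisms and confirms multiplicativity (2.4) on one example.

```python
from fractions import Fraction

D = 7  # K = Q(sqrt(7)), as in the example above


def conjugates(p, q):
    """Images of p + q*sqrt(D) under sigma_1 and sigma_2, as (p, q) pairs."""
    return [(p, q), (p, -q)]


def multiply(x, y):
    """Product of two elements a + b*sqrt(D), each stored as a pair (a, b)."""
    (a, b), (c, d) = x, y
    return (a * c + D * b * d, a * d + b * c)


def norm(p, q):
    """N(p + q*sqrt(D)) as the product of the two conjugates."""
    s1, s2 = conjugates(p, q)
    value, root_part = multiply(s1, s2)
    assert root_part == 0  # the product of the conjugates is rational
    return value


def trace(p, q):
    """T(p + q*sqrt(D)) as the sum of the two conjugates."""
    s1, s2 = conjugates(p, q)
    return s1[0] + s2[0]


alpha = (Fraction(8), Fraction(3))      # 8 + 3*sqrt(7)
beta = (Fraction(1, 2), Fraction(5))    # 1/2 + 5*sqrt(7)
print(norm(*alpha), trace(*alpha))      # p^2 - 7q^2 and 2p for alpha
print(norm(*multiply(alpha, beta)) == norm(*alpha) * norm(*beta))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that integrality of a norm or trace can be read off from its denominator.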
Since norms are not too hard to compute (they can always be found
from symmetric polynomial considerations, often with short-cuts) whereas
discriminants involve complicated work with determinants, the following
result is sometimes useful:


Proposition 2.18. Let $K = \mathbb{Q}(\theta)$ be a number field where $\theta$ has minimum
polynomial $p$ of degree $n$. The $\mathbb{Q}$-basis $\{1, \theta, \ldots, \theta^{n-1}\}$ has discriminant

\[ \Delta[1, \theta, \ldots, \theta^{n-1}] = (-1)^{n(n-1)/2} \, N(Dp(\theta)) \]

where $Dp$ is the formal derivative of $p$.


Proof: From the proof of Theorem 2.7 we obtain

\[ \Delta = \Delta[1, \theta, \ldots, \theta^{n-1}] = \prod_{1 \le i < j \le n} (\theta_i - \theta_j)^2 \]

where $\theta_1, \ldots, \theta_n$ are the conjugates of $\theta$. Now

\[ p(t) = \prod_{i=1}^{n} (t - \theta_i) \]

so that

\[ Dp(t) = \sum_{j=1}^{n} \prod_{\substack{i=1 \\ i \neq j}}^{n} (t - \theta_i) \]

and therefore

\[ Dp(\theta_j) = \prod_{\substack{i=1 \\ i \neq j}}^{n} (\theta_j - \theta_i). \]

Multiplying all these equations for $j = 1, \ldots, n$ we obtain

\[ \prod_{j=1}^{n} Dp(\theta_j) = \prod_{\substack{i,j=1 \\ i \neq j}}^{n} (\theta_j - \theta_i). \]

The left-hand side is $N(Dp(\theta))$. On the right, each factor $(\theta_i - \theta_j)$ for $i < j$
appears twice, once as $(\theta_i - \theta_j)$ and once as $(\theta_j - \theta_i)$. The product of these
two factors is $-(\theta_i - \theta_j)^2$. On multiplying up, we get $\Delta$ multiplied by $(-1)^s$
where $s$ is the number of pairs $(i, j)$ with $1 \le i < j \le n$, which is given by

\[ s = \tfrac12 n(n-1). \]

The result follows. □

We close this chapter by noting the following simple identity linking the
discriminant and trace:

Proposition 2.19. If $\{\alpha_1, \ldots, \alpha_n\}$ is any $\mathbb{Q}$-basis of $K$, then

\[ \Delta[\alpha_1, \ldots, \alpha_n] = \det(T(\alpha_i\alpha_j)). \]

Proof: $T(\alpha_i\alpha_j) = \sum_{r=1}^{n} \sigma_r(\alpha_i\alpha_j) = \sum_{r=1}^{n} \sigma_r(\alpha_i)\sigma_r(\alpha_j)$. Hence

\[
\begin{aligned}
\Delta[\alpha_1, \ldots, \alpha_n] &= (\det(\sigma_i(\alpha_j)))^2 \\
&= (\det(\sigma_j(\alpha_i)))(\det(\sigma_i(\alpha_j))) \\
&= \det\Bigl(\sum_{r=1}^{n} \sigma_r(\alpha_i)\sigma_r(\alpha_j)\Bigr) \\
&= \det(T(\alpha_i\alpha_j)). \qquad □
\end{aligned}
\]
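Proposition 2.19 lends itself to exact computation, since traces of powers of $\theta$ are rational integers when $\theta$ is an integer. The sketch below is an illustration rather than a method taken from the text. It assumes the standard fact that $T(\theta^k)$ equals the matrix trace of the $k$-th power of the companion matrix of the minimum polynomial of $\theta$; the function names are hypothetical. It evaluates $\det(T(\theta^{i+j}))$ for the basis $\{1, \theta, \ldots, \theta^{n-1}\}$, and the result for $t^3 - 5$ can be compared with Proposition 2.18, which gives $(-1)^3 N(3\theta^2) = -27 \cdot 25$.

```python
def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def power_basis_discriminant(coeffs):
    """Discriminant of {1, theta, ..., theta^(n-1)}, where theta is a root of the
    monic polynomial t^n + coeffs[n-1]*t^(n-1) + ... + coeffs[0], assumed to be
    the minimum polynomial of theta. Computed as det(T(theta^(i+j)))."""
    n = len(coeffs)
    # Companion matrix: multiplication by theta on the basis 1, theta, ..., theta^(n-1).
    companion = [[0] * n for _ in range(n)]
    for i in range(1, n):
        companion[i][i - 1] = 1
    for i in range(n):
        companion[i][n - 1] = -coeffs[i]
    # traces[k] = T(theta^k), read off as the matrix trace of companion^k.
    traces = []
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(2 * n - 1):
        traces.append(sum(power[i][i] for i in range(n)))
        power = mat_mul(power, companion)
    return det([[traces[i + j] for j in range(n)] for i in range(n)])


print(power_basis_discriminant([-5, 0, 0]))   # theta^3 = 5
print(power_basis_discriminant([-1, -1]))     # theta^2 = theta + 1
```

The second call uses $\theta = \tfrac12 + \tfrac12\sqrt5$, a root of $t^2 - t - 1$, and recovers the discriminant of $\mathbb{Q}(\sqrt5)$ computed earlier.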


2.6 Rings of Integers


We now discuss how to find the ring of integers of a given number field.
With the methods available to us, this involves moderately heavy calculation;
but by taking advantage of short cuts the technique can be made
reasonably efficient. In particular we show in Example 2.23 below that not
every number field has an integral basis of the form $\{1, \theta, \ldots, \theta^{n-1}\}$.

The method is based on the following result:


Theorem 2.20. Let $G$ be an additive subgroup of $\mathfrak{O}$ of rank equal to the degree
of $K$, with $\mathbb{Z}$-basis $\{\alpha_1, \ldots, \alpha_n\}$. Then $|\mathfrak{O}/G|^2$ divides $\Delta[\alpha_1, \ldots, \alpha_n]$.


Proof: By Theorem 1.16 there exists a $\mathbb{Z}$-basis for $\mathfrak{O}$ of the form $\{\beta_1, \ldots, \beta_n\}$
such that $G$ has a $\mathbb{Z}$-basis $\{\mu_1\beta_1, \ldots, \mu_n\beta_n\}$ for suitable $\mu_i \in \mathbb{Z}$. Now

\[ \Delta[\alpha_1, \ldots, \alpha_n] = \Delta[\mu_1\beta_1, \ldots, \mu_n\beta_n] \]

since by Lemma 1.15 a basis-change has a unimodular matrix; and the
right-hand side is equal to

\[ (\mu_1 \cdots \mu_n)^2 \, \Delta[\beta_1, \ldots, \beta_n] = (\mu_1 \cdots \mu_n)^2 \, \Delta \]

where $\Delta$ is the discriminant of $K$ and so lies in $\mathbb{Z}$. But

\[ |\mu_1 \cdots \mu_n| = |\mathfrak{O}/G|. \]

Therefore $|\mathfrak{O}/G|^2$ divides $\Delta[\alpha_1, \ldots, \alpha_n]$. □

In the above situation we use the notation

\[ \Delta_G = \Delta[\alpha_1, \ldots, \alpha_n]. \]

We then have a generalization of Theorem 2.17:


Proposition 2.21. Suppose that $G \neq \mathfrak{O}$. Then there exists an algebraic
integer of the form

\[ \frac{1}{p}(\lambda_1\alpha_1 + \cdots + \lambda_n\alpha_n) \tag{2.6} \]

where $0 \le \lambda_i \le p-1$, $\lambda_i \in \mathbb{Z}$, and $p$ is a prime such that $p^2$ divides $\Delta_G$.


Proof: If $G \neq \mathfrak{O}$ then $|\mathfrak{O}/G| > 1$. Therefore (by the structure theory for
finite abelian groups) there exists a prime $p$ dividing $|\mathfrak{O}/G|$ and an element
$u \in \mathfrak{O} \setminus G$ such that $g = pu \in G$. By Theorem 2.20, $p^2$ divides $\Delta_G$. Further,

\[ u = \frac{1}{p}\, g = \frac{1}{p}(\lambda_1\alpha_1 + \cdots + \lambda_n\alpha_n) \]

since $\{\alpha_i\}$ forms a $\mathbb{Z}$-basis for $G$. □


Note that this really is a generalization of Theorem 2.17: if $\Delta_G$ is
squarefree then no such $p$ exists, so that $G = \mathfrak{O}$.


We may use Proposition 2.21 as the basis of a trial-and-error search for
algebraic integers in $\mathfrak{O}$ but not in $G$, because there are only finitely many
possibilities (2.6). The idea is:


(a) Start with an initial guess $G$ for $\mathfrak{O}$.

(b) Compute $\Delta_G$.

(c) For each prime $p$ whose square divides $\Delta_G$, test all numbers of the
form (2.6) to see which are algebraic integers.

(d) If any new integers arise, enlarge $G$ to a new $G'$ by adding in the
new number (and divide $\Delta_G$ by $p^2$ to get $\Delta_{G'}$).

(e) Repeat until no new algebraic integers are found.
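Step (c) of the procedure above can be sketched in code for one family of fields. The following Python fragment is a hedged illustration, not the book's own algorithm: it assumes $K = \mathbb{Q}(\theta)$ with $\theta^3 = m$ for an integer $m$ such that $t^3 - m$ is irreducible, takes the initial guess $G = \mathbb{Z}[\theta]$ with $\Delta_G = -27m^2$, and tests integrality by checking that the field polynomial (the characteristic polynomial of multiplication by the element) has integer coefficients. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def is_integral(a, b, c, m):
    """True if a + b*theta + c*theta^2 (theta^3 = m) is an algebraic integer,
    i.e. its field polynomial t^3 - trace*t^2 + middle*t - norm lies in Z[t]."""
    trace = 3 * a
    middle = 3 * a * a - 3 * b * c * m
    norm = a ** 3 + m * b ** 3 + m * m * c ** 3 - 3 * a * b * c * m
    return all(x.denominator == 1 for x in (trace, middle, norm))


def primes_with_square_dividing(n):
    """Primes p such that p^2 divides n (trial division, fine for small n)."""
    n = abs(n)
    return [p for p in range(2, int(n ** 0.5) + 1)
            if n % (p * p) == 0
            and all(p % q for q in range(2, int(p ** 0.5) + 1))]


def new_integers(m):
    """Step (c) for G = Z[theta]: all (p, (l1, l2, l3)) for which
    (l1 + l2*theta + l3*theta^2)/p is an algebraic integer not in G."""
    delta_G = -27 * m * m  # discriminant of {1, theta, theta^2}
    found = []
    for p in primes_with_square_dividing(delta_G):
        for lam in product(range(p), repeat=3):
            if not any(lam):
                continue  # the zero element lies in G
            a, b, c = (Fraction(l, p) for l in lam)
            if is_integral(a, b, c, m):
                found.append((p, lam))
    return found


print(new_integers(5))    # candidates for theta^3 = 5
print(new_integers(10))   # candidates for theta^3 = 10
```

For $m = 10$ the search reports, among others, $p = 3$ with $(\lambda_1, \lambda_2, \lambda_3) = (1, 1, 1)$, so in that field the initial guess would have to be enlarged as in step (d).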
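Example 2.22. Find the ring of integers of $\mathbb{Q}(\sqrt[3]{5})$.


Let $\theta \in \mathbb{R}$, $\theta^3 = 5$. The natural first guess is that $\mathfrak{O}$ has $\mathbb{Z}$-basis
$\{1, \theta, \theta^2\}$. Let $G$ be the abelian group generated by this set. Let $\omega = e^{2\pi i/3}$
be a cube root of unity. We compute

\[
\begin{aligned}
\Delta_G &= \begin{vmatrix} 1 & \theta & \theta^2 \\ 1 & \omega\theta & \omega^2\theta^2 \\ 1 & \omega^2\theta & \omega\theta^2 \end{vmatrix}^2
= \theta^6 \begin{vmatrix} 1 & 1 & 1 \\ 1 & \omega & \omega^2 \\ 1 & \omega^2 & \omega \end{vmatrix}^2 \\
&= 5^2 \cdot (\omega^2 + \omega^2 + \omega^2 - \omega - \omega - \omega)^2 \\
&= 5^2 \cdot 3^2 \cdot (\omega^2 - \omega)^2 \\
&= 3^2 \cdot 5^2 \cdot (-3) \\
&= -3^3 \cdot 5^2.
\end{aligned}
\]

By Proposition 2.21 we must consider two possibilities.

(a) Can $\alpha = \tfrac13(\lambda_1 + \lambda_2\theta + \lambda_3\theta^2)$ be an algebraic integer, for $0 \le \lambda_i \le 2$?

(b) Can $\alpha = \tfrac15(\lambda_1 + \lambda_2\theta + \lambda_3\theta^2)$ be an algebraic integer, for $0 \le \lambda_i \le 4$?

Consider case (b), which is harder. First use the trace: we have

\[ T(\alpha) = 3\lambda_1/5 \in \mathbb{Z} \]

so that $\lambda_1 \in 5\mathbb{Z}$. Then

\[ \alpha' = \tfrac15(\lambda_2\theta + \lambda_3\theta^2) \]

is also an algebraic integer.

Now compute the norm of $\alpha'$. (It is easier to do this for $\alpha'$ than for $\alpha$
because there are fewer terms, which is why we use the trace first.) We
have

\[
\begin{aligned}
N(a\theta + b\theta^2) &= (a\theta + b\theta^2)(a\omega\theta + b\omega^2\theta^2)(a\omega^2\theta + b\omega\theta^2) \\
&= \omega \cdot \omega^2 (a\theta + b\theta^2)(a\theta + \omega b\theta^2)(a\theta + \omega^2 b\theta^2) \\
&= (a\theta)^3 + (b\theta^2)^3 \\
&= 5a^3 + 25b^3.
\end{aligned}
\]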
It follows that for $\alpha$ to be an algebraic integer, we must have $N(\alpha') \in \mathbb{Z}$.

But $N(\alpha') = (5\lambda_2^3 + 25\lambda_3^3)/125 = (\lambda_2^3 + 5\lambda_3^3)/25$. One way to finish the
calculation is just to try all cases:

    $\lambda_2$   $\lambda_3$   $\lambda_2^3 + 5\lambda_3^3$   Divisible by 25?
    0     1       5       No
    0     2      40       No
    0     3     135       No
    0     4     320       No
    1     0       1       No
    1     1       6       No
    1     2      41       No
    1     3     136       No
    1     4     321       No
    2     0       8       No
    2     1      13       No
    2     2      48       No
    2     3     143       No
    2     4     328       No
    3     0      27       No
    3     1      32       No
    3     2      67       No
    3     3     162       No
    3     4     347       No
    4     0      64       No
    4     1      69       No
    4     2     104       No
    4     3     199       No
    4     4     384       No

This shows that if there are no better ideas, brute force can suffice. But
here it is not hard to find a better idea.
Suppose $\lambda_2^3 + 5\lambda_3^3 \equiv 0 \pmod{25}$. If $\lambda_3 \equiv 0 \pmod 5$, then we must also
have $\lambda_2 \equiv 0 \pmod 5$. If not, we have $5 \equiv (-\lambda_2/\lambda_3)^3 \pmod{25}$. Therefore
5 is a cubic residue (mod 25), that is, is congruent to a cube. The factor
5 shows that we must have $5 \equiv (5k)^3 \pmod{25}$, but then $5 \equiv 0 \pmod{25}$,
an impossibility.

Whichever argument we use, we have shown that no new $\alpha'$ occurs in
case (b). The analysis in case (a) is similar, and left as Exercise 6 in this
chapter.
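Both arguments for case (b) are easy to confirm by machine. The short Python sketch below is an added illustration (the variable names are arbitrary): it enumerates the 24 nonzero pairs $(\lambda_2, \lambda_3)$ and also lists the cubes modulo 25 to confirm that 5 is not among them.

```python
from itertools import product

# Brute force: pairs (l2, l3), not both zero, with l2^3 + 5*l3^3 divisible by 25.
hits = [(l2, l3) for l2, l3 in product(range(5), repeat=2)
        if (l2, l3) != (0, 0) and (l2 ** 3 + 5 * l3 ** 3) % 25 == 0]
print(hits)  # an empty list means no new integer arises in case (b)

# Shortcut: the set of cubes modulo 25, to check whether 5 is a cubic residue.
cubes_mod_25 = sorted({x ** 3 % 25 for x in range(25)})
print(cubes_mod_25, 5 in cubes_mod_25)
```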
Note that it is necessary for $N(\alpha)$ and $T(\alpha)$ to be rational integers, in
order for $\alpha$ to be an algebraic integer; but it may not be sufficient. If the
use of norms and traces produces a candidate for a new algebraic integer,
we still have to check that it is one, for example by finding its minimum
polynomial. However, our main use of $N(\alpha)$ and $T(\alpha)$ is to rule out possible
candidates, so this step is not always needed.


Example 2.23.
(a) Find the ring of integers of $\mathbb{Q}(\sqrt[3]{175})$.

(b) Show that it has no $\mathbb{Z}$-basis of the form $\{1, \theta, \theta^2\}$.


Solution:
(a) Let $t = \sqrt[3]{175} = \sqrt[3]{5^2 \cdot 7}$. Consider also $u = \sqrt[3]{5 \cdot 7^2} = \sqrt[3]{245}$. We
have

\[ ut = 35, \qquad u^2 = 7t, \qquad t^2 = 5u. \]

Let $\mathfrak{O}$ be the ring of integers of $K = \mathbb{Q}(\sqrt[3]{175})$.

We have $u = 35/t \in K$. But $u^3 - 245 = 0$ so $u \in \mathbf{B}$. Therefore
$u \in \mathbf{B} \cap K = \mathfrak{O}$.

A good initial guess is that $\mathfrak{O} = G$, where $G$ is the abelian group
generated by $\{1, t, u\}$.

To see if this is correct, we compute $\Delta_G$. The monomorphisms $K \to \mathbb{C}$
are $\sigma_1, \sigma_2, \sigma_3$ where $\sigma_1(t) = t$, $\sigma_2(t) = \omega t$, $\sigma_3(t) = \omega^2 t$. Since $tu = 35$, which
must be fixed by each $\sigma_i$, we have $\sigma_1(u) = u$, $\sigma_2(u) = \omega^2 u$, $\sigma_3(u) = \omega u$.
Therefore

\[ \Delta_G = \begin{vmatrix} 1 & t & u \\ 1 & \omega t & \omega^2 u \\ 1 & \omega^2 t & \omega u \end{vmatrix}^2 \]

which works out as $-3^3 \cdot 5^2 \cdot 7^2$.

There are now three primes to try: $p = 3$, 5, or 7.

If $p = 5$ or 7 then, as in Example 2.22, use of the trace lets us assume
that our putative integer is $\frac{1}{p}(at + bu)$ for $a, b \in \mathbb{Z}$. Now

\[ N(at + bu) = 175a^3 + 245b^3 \]

and we must see whether this can be congruent to 0 (mod $5^3$ or $7^3$) for $a$,
$b$ not congruent to zero.

Suppose $175a^3 + 245b^3 \equiv 0 \pmod{125}$, that is, $35a^3 + 49b^3 \equiv 0 \pmod{25}$.
Write this as $10a^3 - b^3 \equiv 0 \pmod{25}$. If $a \equiv 0 \pmod 5$ then also
$b \equiv 0 \pmod 5$. If not, $10 \equiv (b/a)^3 \pmod{25}$ is a cubic residue; but then
$10 \equiv (5k)^3 \pmod{25}$, hence $10 \equiv 0 \pmod{25}$ which is absurd. The case
$p = 7$ is dealt with in the same way.

When $p = 3$ the trace is no help, and we must compute the norm of

\[ \tfrac13(a + bt + cu) \]

for $a, b, c \in \mathbb{Z}$. The calculation is more complicated, but not too bad since
we only have to consider $a, b, c = 0, 1, 2$. No new integers occur.
Therefore $\mathfrak{O} = G$ as we hoped.


(b) Now we have to show that there is no $\mathbb{Z}$-basis of the form $\{1, \theta, \theta^2\}$,
where $\theta = a + bt + cu$. Note that $\{1, \theta, \theta^2\}$ is a $\mathbb{Z}$-basis if and only if
$\{1, \theta + 1, (\theta + 1)^2\}$ is a $\mathbb{Z}$-basis; so we may without loss of generality assume
that $a = 0$. Now

\[
\begin{aligned}
(bt + cu)^2 &= b^2t^2 + 2bc\,tu + c^2u^2 \\
&= 5b^2u + 70bc + 7c^2t.
\end{aligned}
\]

Therefore $\{1, bt + cu, (bt + cu)^2\}$ is a $\mathbb{Z}$-basis if and only if the matrix

\[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & b & c \\ 70bc & 7c^2 & 5b^2 \end{pmatrix} \]

is unimodular; that is,

\[ 5b^3 - 7c^3 = \pm 1. \]

Consider this modulo 7. Cubes are congruent to 0, 1, or $-1$ (mod 7), so
we have $5(-1, 0, \text{or } 1) \equiv \pm 1 \pmod 7$, a contradiction.
Hence no such $\mathbb{Z}$-basis exists.
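The congruence argument in part (b) can be double-checked by listing every residue that $5b^3 - 7c^3$ takes modulo 7. This is a small added sketch in Python; the variable name `residues` is arbitrary.

```python
# All values of 5*b^3 - 7*c^3 modulo 7, as b and c run over residues mod 7.
residues = {(5 * b ** 3 - 7 * c ** 3) % 7 for b in range(7) for c in range(7)}
print(sorted(residues))

# +1 and -1 correspond to the residues 1 and 6 modulo 7.
print(residues.isdisjoint({1, 6}))
```

Since neither 1 nor 6 occurs, the equation $5b^3 - 7c^3 = \pm 1$ has no solution in rational integers.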


Example 2.24. Find the ring of integers of $\mathbb{Q}(\sqrt2, i)$.


(Here, our initial guess turns out not to be good enough, so this example
illustrates how to continue the analysis when this unfortunate event
occurs.)

The obvious guess is $\{1, \sqrt2, i, i\sqrt2\}$. Let $G$ be the group these generate.
We have $\Delta_G = 1024 = 2^{10}$ (the discriminant is positive, because the field has
two pairs of complex conjugate monomorphisms), so $\mathfrak{O}$ may contain elements of
the form $\tfrac12 g$ (and then possibly $\tfrac14 g$, $\tfrac18 g$, and so on) for $g \in G$. The norm is

\[ N(a + b\sqrt2 + ci + di\sqrt2) = (a^2 - c^2 - 2b^2 + 2d^2)^2 + 4(ac - 2bd)^2. \]

We must find whether this is divisible by 16 for $a, b, c, d = 0$ or 1, and not
all zero. By trial and error the only case where this occurs is $b = d = 1$,
$a = c = 0$. So the only candidate is

\[ \alpha = \tfrac12\theta(1 + i), \qquad \theta = \sqrt2. \]

Now $\alpha^2 = i$, so that

\[ \alpha^4 + 1 = 0 \]

and $\alpha$ is an integer.
We therefore revise our initial guess to the group $G'$ generated by

\[ \{1, \theta, i, \theta i, \tfrac12\theta(1 + i)\}. \]

Since $2 \cdot \tfrac12\theta(1 + i) = \theta + \theta i$ this has a $\mathbb{Z}$-basis

\[ \{1, \theta, i, \tfrac12\theta(1 + i)\}. \]

Now

\[ \Delta_{G'} = 1024/2^2 = 256. \]

A recalculation of the usual kind shows that nothing of the form $\tfrac12 g$ (where
we may now assume that the term in $\tfrac12\theta(1 + i)$ occurs with nonzero coefficient)
has integer norm. So no new integers arise and $\mathfrak{O} = G'$.
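The trial-and-error step in this example involves only fifteen cases, so it is easily automated. The Python sketch below is an added illustration: it uses the norm formula for $a + b\sqrt2 + ci + di\sqrt2$ exactly as stated in the example and lists the coefficient vectors in $\{0, 1\}^4$, not all zero, whose norm is divisible by 16. The function name `norm` is a hypothetical helper.

```python
from itertools import product


def norm(a, b, c, d):
    """Norm of a + b*sqrt(2) + c*i + d*i*sqrt(2), using the formula in the text."""
    return (a * a - c * c - 2 * b * b + 2 * d * d) ** 2 + 4 * (a * c - 2 * b * d) ** 2


# Half of such an element has norm N/16, so integrality requires 16 | N.
candidates = [v for v in product((0, 1), repeat=4)
              if any(v) and norm(*v) % 16 == 0]
print(candidates)  # coefficient vectors (a, b, c, d)
```

The only vector reported is $(a, b, c, d) = (0, 1, 0, 1)$, matching the case $b = d = 1$, $a = c = 0$ singled out in the example.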


2.7 Exercises


1. Which of the following complex numbers are algebraic? Which are
   algebraic integers?

   a) $355/113$

   b) $e^{2\pi i/23}$

   c) $e^{\pi i/23}$

   d) $\sqrt{17} + \sqrt{19}$

   e) $(1 + \sqrt{17})/(2\sqrt{-19})$

   f) $\sqrt{1 + \sqrt2} + \sqrt{1 - \sqrt2}$.

2. Express $\mathbb{Q}(\sqrt3, \sqrt[3]{5})$ in the form $\mathbb{Q}(\theta)$.

3. Find all monomorphisms $\mathbb{Q}(\sqrt[3]{7}) \to \mathbb{C}$.

4. Find the discriminant of $\mathbb{Q}(\sqrt3, \sqrt5)$.

5. Let $K = \mathbb{Q}(\sqrt[4]{2})$. Find all monomorphisms $\sigma : K \to \mathbb{C}$ and the
   minimum polynomials (over $\mathbb{Q}$) and field polynomials (over $K$) of
   (i) $\sqrt[4]{2}$ (ii) $\sqrt2$ (iii) $2$ (iv) $\sqrt2 + 1$. Compare with Theorem 2.6.

6. Complete Example 2.22 above by discussing the case $p = 3$.

7. Complete Example 2.23 above by discussing the case $p = 3$.

8. Compute integral bases and discriminants of

   (a) $\mathbb{Q}(\sqrt2, \sqrt3)$
   (b) $\mathbb{Q}(\sqrt2, i)$
   (c) $\mathbb{Q}(\sqrt[3]{2})$
   (d) $\mathbb{Q}(\sqrt[4]{2})$.

9. Let $K = \mathbb{Q}(\theta)$ where $\theta \in \mathfrak{O}_K$. Among the elements

   \[ \frac{1}{d}(a_0 + \cdots + a_i\theta^i) \]

   ($0 \neq a_i$; $a_0, \ldots, a_i \in \mathbb{Z}$), where $d$ is the discriminant, pick one with
   minimal value of $|a_i|$ and call it $x_i$. Do this for $i = 1, \ldots, n = [K : \mathbb{Q}]$.
   Show that $\{x_1, \ldots, x_n\}$ is an integral basis.

10. If $\alpha_1, \ldots, \alpha_n$ are $\mathbb{Q}$-linearly independent algebraic integers in $\mathbb{Q}(\theta)$,
    and if

    \[ \Delta[\alpha_1, \ldots, \alpha_n] = d \]

    where $d$ is the discriminant of $\mathbb{Q}(\theta)$, show that $\{\alpha_1, \ldots, \alpha_n\}$ is an
    integral basis for $\mathbb{Q}(\theta)$.

11. If $[K : \mathbb{Q}] = n$, $\alpha \in \mathbb{Q}$, show

    \[ N_K(\alpha) = \alpha^n, \qquad T_K(\alpha) = n\alpha. \]

12. Give examples to show that for fixed $\alpha$, $N_K(\alpha)$ and $T_K(\alpha)$ depend
    on $K$. (This is to emphasize that the norm and trace must always be
    defined in the context of a specific field $K$; there is no such thing as
    the norm or trace of $\alpha$ without a specified field.)

13. The norm and trace may be generalized by considering number fields
    $K \supseteq L$. Suppose $K = L(\theta)$ and $[K : L] = n$. Consider monomorphisms
    $\sigma : K \to \mathbb{C}$ such that $\sigma(x) = x$ for all $x \in L$. Show that there
    are precisely $n$ such monomorphisms $\sigma_1, \ldots, \sigma_n$ and describe them.
    For $\alpha \in K$, define

    \[ N_{K/L}(\alpha) = \prod_{i=1}^{n} \sigma_i(\alpha), \qquad T_{K/L}(\alpha) = \sum_{i=1}^{n} \sigma_i(\alpha). \]

    (Compared with our earlier notation, we have $N_K = N_{K/\mathbb{Q}}$, $T_K = T_{K/\mathbb{Q}}$.)
    Prove that

    \[ N_{K/L}(\alpha_1\alpha_2) = N_{K/L}(\alpha_1)N_{K/L}(\alpha_2), \]

    \[ T_{K/L}(\alpha_1 + \alpha_2) = T_{K/L}(\alpha_1) + T_{K/L}(\alpha_2). \]

    Let $K = \mathbb{Q}(\sqrt[4]{3})$, $L = \mathbb{Q}(\sqrt3)$. Calculate $N_{K/L}(\alpha)$, $T_{K/L}(\alpha)$ for
    $\alpha = \sqrt[4]{3}$ and $\alpha = \sqrt[4]{3} + \sqrt3$.

14. For $K = \mathbb{Q}(\sqrt[4]{3})$, $L = \mathbb{Q}(\sqrt3)$, calculate $N_{K/L}(\sqrt3)$ and $N_{K/\mathbb{Q}}(\sqrt3)$.
    Deduce that $N_{K/L}(\alpha)$ depends on $K$ and $L$ (provided that $\alpha \in K$).
    Do the same for $T_{K/L}$.
Quadratic and
Cyclotomic Fields


In this chapter we investigate two special cases of number fields in the 
light of our previous work. The quadratic fields are those of degree 2, and 
are especially important in the study of quadratic forms. The cyclotomic 
fields are generated by pth roots of unity, and we consider only the case 
p prime; it is these which are central to Kummer’s approach to Fermat’s 
Last Theorem and play a substantial role in all subsequent work, including 
Wiles’s proof. We shall return to both types of field at later stages. For 
the moment we content ourselves with finding the rings of integers, integral 
bases, and discriminants. 


3.1 Quadratic Fields 


A quadratic field is a number field $K$ of degree 2 over $\mathbb{Q}$. Then $K = \mathbb{Q}(\theta)$
where $\theta$ is an algebraic integer, and $\theta$ is a zero of

\[ t^2 + at + b \qquad (a, b \in \mathbb{Z}). \]

Thus

\[ \theta = \frac{-a \pm \sqrt{a^2 - 4b}}{2}. \]

Let $a^2 - 4b = r^2 d$ where $r, d \in \mathbb{Z}$ and $d$ is squarefree. (That this is always
possible follows from prime factorization in $\mathbb{Z}$.) Then

\[ \theta = \frac{-a \pm r\sqrt{d}}{2} \]

and so $\mathbb{Q}(\theta) = \mathbb{Q}(\sqrt d)$. Hence we have proved:


Proposition 3.1. The quadratic fields are precisely those of the form $\mathbb{Q}(\sqrt d)$
for $d$ a squarefree rational integer. □


Next we determine the ring of integers of $\mathbb{Q}(\sqrt d)$, for squarefree $d$. The
answer, it turns out, depends on the arithmetic properties of $d$.


Theorem 3.2. Let $d$ be a squarefree rational integer. Then the integers of
$\mathbb{Q}(\sqrt d)$ are:

(a) $\mathbb{Z}[\sqrt d]$ if $d \not\equiv 1 \pmod 4$,
(b) $\mathbb{Z}[\tfrac12 + \tfrac12\sqrt d]$ if $d \equiv 1 \pmod 4$.


Proof: Every element $\alpha \in \mathbb{Q}(\sqrt d)$ is of the form $\alpha = r + s\sqrt d$ for $r, s \in \mathbb{Q}$.
Hence we may write

\[ \alpha = \frac{a + b\sqrt d}{c} \]

where $a, b, c \in \mathbb{Z}$, $c > 0$, and no prime divides all of $a, b, c$. Now $\alpha$ is an
integer if and only if the coefficients of the minimum polynomial

\[ \Bigl(t - \frac{a + b\sqrt d}{c}\Bigr)\Bigl(t - \frac{a - b\sqrt d}{c}\Bigr) \]

are integers. Thus

\[ \frac{a^2 - b^2 d}{c^2} \in \mathbb{Z}, \tag{3.1} \]

\[ \frac{2a}{c} \in \mathbb{Z}. \tag{3.2} \]

If $c$ and $a$ have a common prime factor $p$ then (3.1) implies that $p$ divides
$b$ (since $d$ is squarefree) which contradicts our previous assumption. Hence
from (3.2) we have $c = 1$ or 2. If $c = 1$ then $\alpha$ is an integer of $K$ in any
case, so we may concentrate on the case $c = 2$. Now $a$ and $b$ must both be
odd, and $(a^2 - b^2 d)/4 \in \mathbb{Z}$. Hence

\[ a^2 - b^2 d \equiv 0 \pmod 4. \]

Now an odd number $2k + 1$ has square $4k^2 + 4k + 1 \equiv 1 \pmod 4$, hence
$a^2 \equiv 1 \equiv b^2 \pmod 4$, and this implies $d \equiv 1 \pmod 4$. Conversely, if $d \equiv 1$
(mod 4) then for odd $a$, $b$ we have $\alpha$ an integer because (3.1) and (3.2)
hold.

To sum up: if $d \not\equiv 1 \pmod 4$ then $c = 1$ and so (a) holds; whereas if
$d \equiv 1 \pmod 4$ we can also have $c = 2$ and $a$, $b$ odd, whence easily (b)
holds. □


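The monomorphisms $K \to \mathbb{C}$ are given by

\[ \sigma_1(r + s\sqrt d) = r + s\sqrt d, \qquad \sigma_2(r + s\sqrt d) = r - s\sqrt d. \]

Hence we can compute discriminants:

Theorem 3.3. (a) If $d \not\equiv 1 \pmod 4$ then $\mathbb{Q}(\sqrt d)$ has an integral basis of
the form $\{1, \sqrt d\}$ and discriminant $4d$. (b) If $d \equiv 1 \pmod 4$ then $\mathbb{Q}(\sqrt d)$
has an integral basis of the form $\{1, \tfrac12 + \tfrac12\sqrt d\}$ and discriminant $d$.


Proof: The assertions regarding bases are clear from Theorem 3.2. Computing
discriminants we work out:

\[ \begin{vmatrix} 1 & \sqrt d \\ 1 & -\sqrt d \end{vmatrix}^2 = (-2\sqrt d)^2 = 4d, \]

\[ \begin{vmatrix} 1 & \tfrac12 + \tfrac12\sqrt d \\ 1 & \tfrac12 - \tfrac12\sqrt d \end{vmatrix}^2 = (-\sqrt d)^2 = d. \]

□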


Since the discriminants of isomorphic fields are equal, it follows that for
distinct squarefree $d$ the fields $\mathbb{Q}(\sqrt d)$ are not isomorphic. This completes
the classification of quadratic fields.

A special case, of historical interest as the first number field to be
studied as such, is the Gaussian field $\mathbb{Q}(\sqrt{-1})$. Since $-1 \not\equiv 1 \pmod 4$ the
ring of integers is $\mathbb{Z}[\sqrt{-1}]$ (known as the ring of Gaussian integers) and the
discriminant is $-4$.

Incidentally, these results show that Theorem 2.17 is not always applicable:
an integral basis can have a discriminant which is not squarefree.
For instance, the integral basis $\{1, \sqrt{-1}\}$ of the Gaussian field has
discriminant $-4$.

For future reference we note the norms and traces:

\[ N(r + s\sqrt d) = r^2 - ds^2, \qquad T(r + s\sqrt d) = 2r. \]

We also note some useful terminology. A quadratic field $\mathbb{Q}(\sqrt d)$ is said to
be real if $d$ is positive, imaginary if $d$ is negative. (A real quadratic field
contains only real numbers, an imaginary quadratic field contains proper
complex numbers as well.)
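Theorems 3.2 and 3.3 translate directly into a short computation. The sketch below is an added illustration in Python with hypothetical function names. It returns the discriminant of $\mathbb{Q}(\sqrt d)$ for squarefree $d \neq 1$, and tests whether $r + s\sqrt d$ is an algebraic integer by checking that its trace $2r$ and norm $r^2 - ds^2$ are rational integers, which is sufficient in a quadratic field because these are the coefficients of a monic quadratic satisfied by the element.

```python
from fractions import Fraction


def is_squarefree(d):
    """True if the nonzero integer d is not divisible by the square of a prime."""
    d = abs(d)
    return d != 0 and all(d % (k * k) for k in range(2, int(d ** 0.5) + 1))


def field_discriminant(d):
    """Discriminant of Q(sqrt(d)) for squarefree d != 1, following Theorem 3.3."""
    if d == 1 or not is_squarefree(d):
        raise ValueError("d must be squarefree and different from 1")
    # Python's % returns a residue in {0, 1, 2, 3} even for negative d.
    return d if d % 4 == 1 else 4 * d


def is_integer(r, s, d):
    """Is r + s*sqrt(d) an algebraic integer of Q(sqrt(d))?"""
    r, s = Fraction(r), Fraction(s)
    return (2 * r).denominator == 1 and (r * r - d * s * s).denominator == 1


half = Fraction(1, 2)
print(field_discriminant(-1), field_discriminant(5), field_discriminant(7))
print(is_integer(half, half, 5), is_integer(half, half, 7))
```

The second line of output reflects the dichotomy of Theorem 3.2: $\tfrac12 + \tfrac12\sqrt d$ is an integer when $d \equiv 1 \pmod 4$ but not otherwise.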


3.2 Cyclotomic Fields 


A cyclotomic field is one of the form $\mathbb{Q}(\zeta)$ where $\zeta = e^{2\pi i/m}$ is a primitive
complex $m$th root of unity. (The name means 'circle-cutting' and refers
to the equal spacing of powers of $\zeta$ around the unit circle in the complex
plane.) We shall consider only the case $m = p$, a prime number. Further,
if $p = 2$ then $\zeta = -1$ so that $\mathbb{Q}(\zeta) = \mathbb{Q}$, hence we ignore this case and
assume $p$ odd.


Lemma 3.4. The minimum polynomial of $\zeta = e^{2\pi i/p}$, $p$ an odd prime, over
$\mathbb{Q}$ is

\[ f(t) = t^{p-1} + t^{p-2} + \cdots + t + 1. \]

The degree of $\mathbb{Q}(\zeta)$ is $p - 1$.


Proof: We have

\[ f(t) = \frac{t^p - 1}{t - 1}. \]

Since $\zeta - 1 \neq 0$ and $\zeta^p = 1$ it follows that $f(\zeta) = 0$, so all we need prove is
that $f$ is irreducible. This we do by a standard piece of trickery. We have

\[ f(t + 1) = \frac{(t + 1)^p - 1}{t} = \sum_{r=1}^{p} \binom{p}{r} t^{r-1}. \]

Now the binomial coefficient $\binom{p}{r}$ is divisible by $p$ if $1 \le r \le p - 1$, and
$\binom{p}{1} = p$ is not divisible by $p^2$.

Hence by Eisenstein's criterion (Theorem 1.8) $f(t + 1)$ is irreducible.
Therefore $f(t)$ is irreducible, and is the minimum polynomial of $\zeta$. Since
$\partial f = p - 1$ we have $[\mathbb{Q}(\zeta) : \mathbb{Q}] = p - 1$ by Theorem 1.11. □
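The divisibility conditions used in this proof can be inspected for small primes. The following Python sketch is an added illustration with hypothetical function names: it lists the coefficients of $f(t + 1)$ as binomial coefficients and checks the three hypotheses of Eisenstein's criterion at $p$.

```python
from math import comb


def shifted_cyclotomic(p):
    """Coefficients of f(t+1) = ((t+1)^p - 1)/t, constant term first."""
    return [comb(p, r) for r in range(1, p + 1)]


def eisenstein_applies(coeffs, p):
    """Eisenstein's criterion at p for a polynomial given constant term first:
    p does not divide the leading coefficient, p divides all the others,
    and p^2 does not divide the constant term."""
    *lower, leading = coeffs
    return (leading % p != 0
            and all(c % p == 0 for c in lower)
            and lower[0] % (p * p) != 0)


print(shifted_cyclotomic(5))
print([eisenstein_applies(shifted_cyclotomic(p), p) for p in (3, 5, 7, 11, 13)])
```

The check is only a sanity test of the hypotheses for particular primes; the general statement is the one proved in the lemma.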
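The powers $\zeta, \zeta^2, \ldots, \zeta^{p-1}$ are also $p$th roots of unity, not equal to 1,
and so by the same argument have $f(t)$ as minimum polynomial. Clearly

\[ f(t) = (t - \zeta)(t - \zeta^2) \cdots (t - \zeta^{p-1}) \tag{3.3} \]

and thus the conjugates of $\zeta$ are $\zeta, \zeta^2, \ldots, \zeta^{p-1}$. This means that the
monomorphisms from $\mathbb{Q}(\zeta)$ to $\mathbb{C}$ are given by

\[ \sigma_i(\zeta) = \zeta^i \qquad (1 \le i \le p - 1). \]

Because the minimum polynomial $f(t)$ has degree $p - 1$, a basis for $\mathbb{Q}(\zeta)$
over $\mathbb{Q}$ is $1, \zeta, \ldots, \zeta^{p-2}$, so for a general element

\[ \alpha = a_0 + a_1\zeta + \cdots + a_{p-2}\zeta^{p-2} \qquad (a_i \in \mathbb{Q}) \]

we have

\[ \sigma_i(a_0 + \cdots + a_{p-2}\zeta^{p-2}) = a_0 + a_1\zeta^i + \cdots + a_{p-2}\zeta^{i(p-2)}. \]

From this formula the norm and trace may be calculated using the basic
definitions

\[ N(\alpha) = \prod_{i=1}^{p-1} \sigma_i(\alpha), \qquad T(\alpha) = \sum_{i=1}^{p-1} \sigma_i(\alpha). \]

In particular

\[ N(\zeta) = \zeta \cdot \zeta^2 \cdots \zeta^{p-1}. \]

Now $\zeta$ and $\zeta^i$ $(1 \le i \le p - 1)$ are conjugates, so have the same norm, which
can be calculated by putting $t = 0$ in (3.3) to give

\[ N(\zeta) = N(\zeta^i) = (-1)^{p-1} \]

and since $p$ is odd,

\[ N(\zeta^i) = 1 \qquad (1 \le i \le p - 1). \tag{3.4} \]

The trace of $\zeta^i$ can be found by a similar argument. We have

\[ T(\zeta^i) = T(\zeta) = \zeta + \zeta^2 + \cdots + \zeta^{p-1}, \]

and using the fact that

\[ f(\zeta) = 1 + \zeta + \cdots + \zeta^{p-1} = 0 \]

we find

\[ T(\zeta^i) = -1 \qquad (1 \le i \le p - 1). \tag{3.5} \]

For $a \in \mathbb{Q}$ we trivially have

\[ N(a) = a^{p-1}, \qquad T(a) = (p - 1)a. \]

Since $\zeta^p = 1$, we can use these formulas to extend (3.4) and (3.5) to

\[ N(\zeta^s) = 1 \qquad \text{for all } s \in \mathbb{Z} \tag{3.6} \]

and

\[ T(\zeta^s) = \begin{cases} -1 & \text{if } s \not\equiv 0 \pmod p, \\ p - 1 & \text{if } s \equiv 0 \pmod p. \end{cases} \tag{3.7} \]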


For a general element of $\mathbb{Q}(\zeta)$, the trace is easily calculated:

\[
\begin{aligned}
T\Bigl(\sum_{i=0}^{p-2} a_i\zeta^i\Bigr) &= \sum_{i=0}^{p-2} T(a_i\zeta^i) \\
&= T(a_0) + \sum_{i=1}^{p-2} T(a_i\zeta^i) \\
&= (p - 1)a_0 - \sum_{i=1}^{p-2} a_i
\end{aligned}
\]

and so

\[ T\Bigl(\sum_{i=0}^{p-2} a_i\zeta^i\Bigr) = pa_0 - \sum_{i=0}^{p-2} a_i. \tag{3.8} \]

The norm is more complicated in general, but a useful special case is

\[ N(1 - \zeta) = \prod_{i=1}^{p-1} (1 - \zeta^i) \]

which can be calculated by putting $t = 1$ in (3.3) to obtain

\[ \prod_{i=1}^{p-1} (1 - \zeta^i) = p, \tag{3.9} \]

so that

\[ N(1 - \zeta) = p. \tag{3.10} \]
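Formulas (3.8) and (3.10) can be checked numerically for small primes. The sketch below is an added illustration in Python: it evaluates the conjugates $\sigma_i(\alpha)$ in floating-point complex arithmetic, so the results are approximations and agreement is only up to rounding error. The helper names and the sample coefficients are arbitrary choices.

```python
import cmath


def conjugates(coeffs, p):
    """Approximate values sigma_i(alpha), i = 1..p-1, for
    alpha = sum of coeffs[k] * zeta^k with zeta = exp(2*pi*i/p)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return [sum(a * zeta ** (i * k) for k, a in enumerate(coeffs))
            for i in range(1, p)]


def trace(coeffs, p):
    """Sum of the conjugates."""
    return sum(conjugates(coeffs, p))


def norm(coeffs, p):
    """Product of the conjugates."""
    result = 1
    for value in conjugates(coeffs, p):
        result *= value
    return result


# N(1 - zeta) should be close to p, as in (3.10).
for p in (3, 5, 7):
    print(p, norm([1, -1], p))

# (3.8) with p = 5 and alpha = 2 + 3*zeta - zeta^2 + 4*zeta^3 predicts 5*2 - 8 = 2.
print(trace([2, 3, -1, 4], 5))
```

Exact arithmetic in $\mathbb{Z}[\zeta]$ would avoid rounding altogether; floating point is used here only to keep the sketch short.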


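We can put these computations to good use, first by showing that the
integers of $\mathbb{Q}(\zeta)$ are what one naively might expect:


Theorem 3.5. The ring $\mathfrak{O}$ of integers of $\mathbb{Q}(\zeta)$ is $\mathbb{Z}[\zeta]$.


Proof: Suppose $\alpha = a_0 + a_1\zeta + \cdots + a_{p-2}\zeta^{p-2}$ is an integer in $\mathbb{Q}(\zeta)$.
We must demonstrate that the rational numbers $a_i$ are actually rational
integers.
For $0 \le k \le p - 2$ the element

\[ \alpha\zeta^{-k} - \alpha\zeta \]

is an integer, so its trace is a rational integer. But

\[
\begin{aligned}
T(\alpha\zeta^{-k} - \alpha\zeta)
&= T(a_0\zeta^{-k} + \cdots + a_k + \cdots + a_{p-2}\zeta^{p-k-2} - a_0\zeta - \cdots - a_{p-2}\zeta^{p-1}) \\
&= pa_k - (a_0 + \cdots + a_{p-2}) - (-a_0 - \cdots - a_{p-2}) \\
&= pa_k.
\end{aligned}
\]

Hence $b_k = pa_k$ is a rational integer.
Put $\lambda = 1 - \zeta$. Then

\[
\begin{aligned}
p\alpha &= b_0 + b_1\zeta + \cdots + b_{p-2}\zeta^{p-2} \\
&= c_0 + c_1\lambda + \cdots + c_{p-2}\lambda^{p-2}
\end{aligned}
\tag{3.11}
\]

where (substituting $\zeta = 1 - \lambda$ and expanding)

\[ c_i = \sum_{j=i}^{p-2} (-1)^i \binom{j}{i} b_j \in \mathbb{Z}. \]

Since $\lambda = 1 - \zeta$ we also have, symmetrically,

\[ b_i = \sum_{j=i}^{p-2} (-1)^i \binom{j}{i} c_j. \tag{3.12} \]

We claim that all $c_i$ are divisible by $p$. Proceeding by induction, we may
assume this for all $c_i$ with $i \le k - 1$, where $0 \le k \le p - 2$. Since
$c_0 = b_0 + \cdots + b_{p-2} = p(-T(\alpha) + b_0)$, we have $p \mid c_0$, so it is true for $k = 0$. Now
by (3.9)

\[
\begin{aligned}
p &= \prod_{i=1}^{p-1} (1 - \zeta^i) \\
&= (1 - \zeta)^{p-1} \prod_{i=1}^{p-1} (1 + \zeta + \cdots + \zeta^{i-1}) \\
&= \lambda^{p-1}\kappa
\end{aligned}
\tag{3.13}
\]

where $\kappa \in \mathbb{Z}[\zeta] \subseteq \mathfrak{O}$. Consider (3.11) as a congruence modulo the ideal
$\langle\lambda^{k+1}\rangle$ of $\mathfrak{O}$. By (3.13) we have

\[ p \equiv 0 \pmod{\langle\lambda^{k+1}\rangle}, \]

and so the left-hand side of (3.11), and the terms up to $c_{k-1}\lambda^{k-1}$, vanish;
further the terms from $c_{k+1}\lambda^{k+1}$ onwards are multiples of $\lambda^{k+1}$ and also
vanish. There remains:

\[ c_k\lambda^k \equiv 0 \pmod{\langle\lambda^{k+1}\rangle}. \]

This is equivalent to

\[ c_k\lambda^k = \mu\lambda^{k+1} \]

for some $\mu \in \mathfrak{O}$, from which we obtain

\[ c_k = \mu\lambda. \]

Taking norms we get

\[ c_k^{p-1} = N(c_k) = N(\mu)N(\lambda) = pN(\mu), \]

since $N(\lambda) = p$ by (3.10). Hence $p \mid c_k^{p-1}$, so $p \mid c_k$. Hence by induction $p \mid c_k$
for all $k$, and then (3.12) shows that $p \mid b_k$ for all $k$. Therefore $a_k \in \mathbb{Z}$ for
all $k$ and the theorem is proved. □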


Now we can compute the discriminant.


Theorem 3.6. The discriminant of $\mathbb{Q}(\zeta)$, where $\zeta = e^{2\pi i/p}$ and $p$ is an odd
prime, is

\[ (-1)^{(p-1)/2} p^{p-2}. \]

Proof: By Theorem 3.5 an integral basis is $\{1, \zeta, \ldots, \zeta^{p-2}\}$. Hence by
Proposition 2.18 the discriminant is equal to

\[ (-1)^{(p-1)(p-2)/2} \, N(Df(\zeta)) \]

with $f(t)$ as above. Since $p$ is odd the first factor reduces to $(-1)^{(p-1)/2}$.
To evaluate the second, we have

\[ f(t) = \frac{t^p - 1}{t - 1} \]

so that

\[ Df(t) = \frac{(t - 1)pt^{p-1} - (t^p - 1)}{(t - 1)^2} \]

whence

\[ Df(\zeta) = \frac{-p\zeta^{p-1}}{\lambda} \]

where $\lambda = 1 - \zeta$ as before. Hence

\[
\begin{aligned}
N(Df(\zeta)) &= \frac{N(p)N(\zeta)^{p-1}}{N(-\lambda)} \\
&= \frac{p^{p-1} \cdot 1^{p-1}}{p} \\
&= p^{p-2}. \qquad □
\end{aligned}
\]

The case $p = 3$ deserves special mention, for $\mathbb{Q}(\zeta)$ has degree $p - 1 = 2$,
so it is a quadratic field. Since

\[ e^{2\pi i/3} = \frac{-1 + \sqrt{-3}}{2} \]

it is equal to $\mathbb{Q}(\sqrt{-3})$. As a check on our discriminant calculations:
Theorem 3.3 gives $-3$ (since $-3 \equiv 1 \pmod 4$), and Theorem 3.6 gives
$(-1)^{2/2} 3^1 = -3$ as well.


3.3 Exercises 


1. Find integral bases and discriminants for:
   (a) $\mathbb{Q}(\sqrt3)$
   (b) $\mathbb{Q}(\sqrt{-7})$
   (c) $\mathbb{Q}(\sqrt{11})$
   (d) $\mathbb{Q}(\sqrt{-11})$
   (e) $\mathbb{Q}(\sqrt6)$
   (f) $\mathbb{Q}(\sqrt{-6})$

2. Let $K = \mathbb{Q}(\zeta)$ where $\zeta = e^{2\pi i/5}$. Calculate $N_K(\alpha)$ and $T_K(\alpha)$ for the
   following values of $\alpha$:

   (i) $\zeta^2$ (ii) $\zeta + \zeta^2$ (iii) $1 + \zeta + \zeta^2 + \zeta^3 + \zeta^4$.

3. Let $K = \mathbb{Q}(\zeta)$ where $\zeta = e^{2\pi i/p}$ for a rational prime $p$. In the ring of
   integers $\mathbb{Z}[\zeta]$, show that $\alpha \in \mathbb{Z}[\zeta]$ is a unit if and only if $N_K(\alpha) = \pm 1$.

4. If $\zeta = e^{2\pi i/3}$, $K = \mathbb{Q}(\zeta)$, prove that the norm of $\alpha \in \mathbb{Z}[\zeta]$ is of the
   form $\tfrac14(a^2 + 3b^2)$ where $a$, $b$ are rational integers which are either both
   even or both odd. Using the result of Exercise 3, deduce that there
   are precisely six units in $\mathbb{Z}[\zeta]$ and find them all.

5. If $\zeta = e^{2\pi i/5}$, $K = \mathbb{Q}(\zeta)$, prove that the norm of $\alpha \in \mathbb{Z}[\zeta]$ is of the
   form $\tfrac14(a^2 - 5b^2)$ where $a$, $b$ are rational integers. (Hint: in calculating
   $N(\alpha)$, first calculate $\sigma_1(\alpha)\sigma_4(\alpha)$ where $\sigma_i(\zeta) = \zeta^i$. Show that this is
   of the form $q + r\theta + s\phi$ where $q$, $r$, $s$ are rational integers, $\theta = \zeta + \zeta^4$,
   $\phi = \zeta^2 + \zeta^3$. In the same way, establish $\sigma_2(\alpha)\sigma_3(\alpha) = q + s\theta + r\phi$.)
   Using Exercise 3, prove that $\mathbb{Z}[\zeta]$ has an infinite number of units.

6. Let $\zeta = e^{2\pi i/5}$. For $K = \mathbb{Q}(\zeta)$, use the formula
   $N_K(a + b\zeta) = (a^5 + b^5)/(a + b)$ to calculate the following norms:
   (i) $N_K(\zeta + 2)$ (ii) $N_K(\zeta - 2)$ (iii) $N_K(\zeta + 3)$.
   Using the fact that if $\alpha\beta = \gamma$, then $N_K(\alpha)N_K(\beta) = N_K(\gamma)$, deduce
   that $\zeta + 2$, $\zeta - 2$, $\zeta + 3$ have no proper factors (i.e. factors which are
   not units) in $\mathbb{Z}[\zeta]$.

   Factorize 11, 31, 61 in $\mathbb{Z}[\zeta]$.

7. If $\zeta = e^{2\pi i/5}$, as in Exercise 6, calculate

   (i) $N_K(\zeta + 4)$ (ii) $N_K(\zeta - 3)$.

   Deduce that any proper factors of $\zeta + 4$ in $\mathbb{Z}[\zeta]$ have norm 5 or 41.
   Given $\zeta - 1$ is a factor of $\zeta + 4$, find another factor. Verify $\zeta - 3$ is a
   unit times $(\zeta^2 + 2)^2$ in $\mathbb{Z}[\zeta]$.
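8. Show that the multiplicative group of non-zero elements of $\mathbb{Z}_7$ is
   cyclic with generator the residue class of 3. If $\zeta = e^{2\pi i/7}$, define the
   monomorphism $\sigma : \mathbb{Q}(\zeta) \to \mathbb{C}$ by $\sigma(\zeta) = \zeta^3$. Show that all other
   monomorphisms from $\mathbb{Q}(\zeta)$ to $\mathbb{C}$ are of the form $\sigma^i$ $(1 \le i \le 6)$ where
   $\sigma^6 = 1$. For any $\alpha \in \mathbb{Q}(\zeta)$, define $c(\alpha) = \alpha\,\sigma^2(\alpha)\,\sigma^4(\alpha)$, and show
   $N(\alpha) = c(\alpha) \cdot \sigma c(\alpha)$. Demonstrate that $c(\alpha) = \sigma^2 c(\alpha) = \sigma^4 c(\alpha)$.
   Using the relation $1 + \zeta + \cdots + \zeta^6 = 0$, show that every element
   $\alpha \in \mathbb{Q}(\zeta)$ can be written uniquely as $\sum_{i=1}^{6} a_i\zeta^i$ $(a_i \in \mathbb{Q})$. Deduce
   that $c(\alpha) = a_1\theta_1 + a_3\theta_2$ where $\theta_1 = \zeta + \zeta^2 + \zeta^4$, $\theta_2 = \zeta^3 + \zeta^5 + \zeta^6$.
   Show $\theta_1 + \theta_2 = -1$ and calculate $\theta_1\theta_2$. Verify that $c(\alpha)$ may be written
   in the form $b_0 + b_1\theta_1$ where $b_0, b_1 \in \mathbb{Q}$, and show $\sigma c(\alpha) = b_0 + b_1\theta_2$.
   Deduce

   \[ N(\alpha) = b_0^2 - b_0b_1 + 2b_1^2. \]

   Now calculate $N(\zeta + 5\zeta^6)$.

9. Suppose $p$ is a rational prime and $\zeta = e^{2\pi i/p}$. Given that the group
   of non-zero elements of $\mathbb{Z}_p$ is cyclic (see Appendix 1, Proposition 6
   for a proof) show that there exists a monomorphism $\sigma : \mathbb{Q}(\zeta) \to \mathbb{C}$
   such that $\sigma^{p-1}$ is the identity and all monomorphisms from $\mathbb{Q}(\zeta)$
   to $\mathbb{C}$ are of the form $\sigma^i$ $(1 \le i \le p - 1)$. If $p - 1 = kr$, define
   $c_k(\alpha) = \alpha\,\sigma^r(\alpha)\,\sigma^{2r}(\alpha) \cdots \sigma^{(k-1)r}(\alpha)$. Show

   \[ N(\alpha) = c_k(\alpha) \cdot \sigma c_k(\alpha) \cdots \sigma^{r-1} c_k(\alpha). \]

   Prove every element of $\mathbb{Q}(\zeta)$ is uniquely of the form $\sum_{i=1}^{p-1} a_i\zeta^i$, and
   by demonstrating that $\sigma^r(c_k(\alpha)) = c_k(\alpha)$, deduce that
   $c_k(\alpha) = b_1\eta_1 + \cdots + b_r\eta_r$, where

   \[ \eta_1 = \zeta + \sigma^r(\zeta) + \sigma^{2r}(\zeta) + \cdots + \sigma^{(k-1)r}(\zeta) \]

   and $\eta_{i+1} = \sigma^i(\eta_1)$.
   Interpret these results in the case $p = 5$, $k = r = 2$, by showing
   that the residue class of 2 is a generator of the multiplicative group
   of non-zero elements of $\mathbb{Z}_5$. Demonstrate that $c_2(\alpha)$ is of the form
   $b_1\eta_1 + b_2\eta_2$ where $\eta_1 = \zeta + \zeta^4$, $\eta_2 = \zeta^2 + \zeta^3$.
   Calculate the norms of the following elements in $\mathbb{Q}(\zeta)$:
   (i) $\zeta + 2\zeta^2$ (ii) $\zeta + \zeta^4$ (iii) $15\zeta + 15\zeta^4$ (iv) $\zeta + \zeta^2 + \zeta^3 + \zeta^4$.

10. In $\mathbb{Z}[\sqrt{-5}]$, prove 6 factorizes in two ways as

    \[ 6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5}). \]

    Verify that $2$, $3$, $1 + \sqrt{-5}$, $1 - \sqrt{-5}$ have no proper factors in $\mathbb{Z}[\sqrt{-5}]$.
    (Hint: Take norms and note that if $\gamma$ factorizes as $\gamma = \alpha\beta$, then
    $N(\gamma) = N(\alpha)N(\beta)$ is a factorization of rational integers.) Deduce
    that it is possible in $\mathbb{Z}[\sqrt{-5}]$ for 2 to have no proper factors, yet 2
    divides a product $\alpha\beta$ without dividing either $\alpha$ or $\beta$.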
Factorization into Irreducibles


Now we turn to the vexed question of uniqueness of factorization in the 
ring of integers of an algebraic number field. Historically, experience with 
unique factorization of integers and polynomials over a field led to a general 
intuition that factorization of algebraic integers should also be unique. In 
the early days of algebraic number theory many experts, including Euler, 
simply assumed uniqueness without perceiving any need for a proof, and 
used it implicitly to ‘prove’ results that were later found to be based on 
a false assumption. The reason for this misconception is subtle, and has 
its origins in the definition of a prime number. There are two distinct 
properties that can be used to serve as a definition. The most familiar 
of these is that a prime number cannot be factorized into the product of 
two integers other than itself and 1. For a prime p, this property may be 
written more generally as: (a) If p = ab then one of a or b must be a unit. 

In Z the only units are +1, so this reduces to the usual definition. How- 
ever, there is a second property of prime numbers that is also of interest, 
namely that if a prime number p divides a product of two numbers then it 
must divide one or the other: (b) If $p \mid ab$ then $p \mid a$ or $p \mid b$.

What fooled our predecessors was that although these properties are 
equivalent in the ring of integers—and even in some algebraic number 
rings—they are not equivalent in all cases. Of deeper psychological sig- 
nificance is that the more familiar property (a) proves to be less powerful 
than property (b). Property (b) can be shown in general to imply (a); 
moreover, it is property (b) that guarantees uniqueness of factorization, 
not the more comfortable property (a). In contrast, property (a) does 
not imply (b), and (a) turns out to be inadequate to give uniqueness of 
factorization. 
The way out of this dilemma is to use the less familiar (b) for the
definition of a prime. An element $p$ satisfying the weaker assumption (a)
is no longer called a prime: it is said to be irreducible. (We have used that
word in the same sense earlier in this text.) It can then be proved that if
factorization into primes (defined in the new sense) is possible, then it is
unique. In contrast, factorization into irreducibles may not be unique even
when it is possible. For instance, if we work in $\mathbb{Z}[\sqrt{-6}]$, then there are two
factorizations $6 = 2 \cdot 3$ and $6 = (\sqrt{-6})(-\sqrt{-6})$. Here the elements $2$, $3$, $\sqrt{-6}$ are
all irreducible (because they cannot be written as a product of nontrivial
factors in $\mathbb{Z}[\sqrt{-6}]$); however, they are not prime. For instance, $\sqrt{-6}$ is a
factor of $6 = 2 \cdot 3$, but it is not a factor of either 2 or 3 in the ring $\mathbb{Z}[\sqrt{-6}]$.

We must therefore proceed with care, and an awareness of precisely
what we are doing. For instance, if we attempt to factorize an element $x$
in a domain $D$, it is natural to seek proper factors $x = ab$ (meaning that
neither $a$ nor $b$ is a unit). If either of these factors is further reducible, we
factorize it, and so on, seeking a factorization

\[ x = a_1 a_2 \cdots a_n \]

into factors that cannot be reduced any further. Reflecting on what we
are doing, we see that if this search for a factorization terminates, then
it naturally leads to elements which are irreducible (definition (a)) rather
than what we now call primes (definition (b)). So to begin with, we concern
ourselves with factorization into irreducibles.

In general a factorization into irreducibles may not be possible, because
the procedure may continue indefinitely, but it is in a ring $\mathfrak{O}$ of integers in
any number field. (To prove this we introduce the notion of a noetherian
ring, and show that factorization is always possible if the domain $D$ is
noetherian; we then demonstrate that $\mathfrak{O}$ is noetherian.)

Even though factorization into irreducibles is always possible in $\mathfrak{O}$, we
give an extensive list of examples where such a factorization is not unique.
In other cases, however, the existence of a generalized version of the division
algorithm (which we term a 'Euclidean function') implies that every
irreducible is prime. We see that factorization into primes is unique, so some
rings $\mathfrak{O}$ possess unique factorization. In particular we characterize the rings
$\mathfrak{O}$ that possess a Euclidean function for the fields $\mathbb{Q}(\sqrt d)$ with $d$ a negative
rational integer: there are exactly five of them, corresponding to
$d = -1, -2, -3, -7, -11$. We also prove the
existence of a Euclidean function for some fields $\mathbb{Q}(\sqrt d)$ with $d$ positive. In
later chapters we shall see that $\mathfrak{O}$ may have unique factorization without
possessing a Euclidean function.

To begin the chapter we consider a little history, and look at an example
of the intuitive use of unique factorization, to motivate the ideas.

4.1 Historical Background


In the 18th century and the first part of the 19th there were varying stan- 
dards of rigour in number theory. For example Euler, the most prolific 
mathematician of the 18th century, was concerned with obtaining results 
and would, on occasion, use intuitive methods of proof which the hindsight 
of history has shown to be incorrect. For instance, in his famous textbook 
on algebra, he made several elegant applications of unique factorization to 
‘prove’ number theoretic results in cases where unique factorization was 
false. Gauss, on the other hand, found it necessary to demonstrate rigorously
that the so-called ‘Gaussian integers’ $\mathbb{Z}[i]$ did factorize uniquely.
In 1847, Lamé announced to a meeting of the Paris Academy that he 
had proved Fermat’s Last Theorem, but his proof was seen to depend on 
uniqueness of factorization and was shown to be inadequate. Kummer had, 
in fact, published a paper three years earlier that demonstrated the fail- 
ure of unique factorization for cyclotomic integers, thus destroying Lamé’s 
proof, but his publication was in an obscure journal and went unnoticed at 
the time. 


Eisenstein put his finger on the property that characterizes unique
factorization in a letter of 1844, which translates as follows:


If one had the theorem which states that the product of two complex 
numbers can be divisible by a prime number only when one of the 
factors is—which seems completely obvious—then one would have 
the whole theory at a single blow; but this theorem is totally false. 


By ‘the whole theory’ he was referring to consequences of unique fac- 
torization (in particular Fermat’s Last Theorem). 


In Eisenstein’s letter, ‘prime’ meant definition (a) of this chapter, and 
his comment translated into the terminology of this book is ‘if every irre- 
ducible is prime, then unique factorization holds’. It is also clear from his 
comment that he knew of instances of irreducibles which were not prime, 
which gave rise to non-uniqueness of factorization. All this must have 
seemed very confusing to average 19th century mathematicians who were 
used to using intuitive ideas about factorization to demonstrate results. To 
give the reader an idea of what it was like, before we explain the theory of 
unique factorization, we give a concocted proof of a statement of Fermat 
which uses this intuitive language. Fermat’s proof has not survived, and 
we are not suggesting that it resembled our faulty but instructive attempt 
below. Indeed it is not hard to reconstruct a rigorous proof using ideas
that were known to Fermat: see Weil [80].

A statement of Fermat. The equation $y^2 + 2 = x^3$ has only the integer
solutions $y = \pm 5$, $x = 3$.


Intuitive ‘proof’. Clearly $y$ cannot be even, for then the right-hand side
would be divisible by 8, but the left-hand side only by 2. We now factorize
in the ring $\mathbb{Z}[\sqrt{-2}]$, consisting of all $a + b\sqrt{-2}$ for $a, b \in \mathbb{Z}$, to give


$$(y + \sqrt{-2})(y - \sqrt{-2}) = x^3.$$


A common factor $c + d\sqrt{-2}$ of $y + \sqrt{-2}$ and $y - \sqrt{-2}$ would also divide
their sum $2y$ and their difference $2\sqrt{-2}$. Taking norms,


$$c^2 + 2d^2 \mid 4y^2, \qquad c^2 + 2d^2 \mid 8,$$


hence (since $y$ is odd) $c^2 + 2d^2 \mid 4$. The only solutions of this relation are $c = \pm 1$, $d = 0$,
or $c = 0$, $d = \pm 1$, or $c = \pm 2$, $d = 0$. None of these give proper factors of
$y + \sqrt{-2}$, so $y + \sqrt{-2}$ and $y - \sqrt{-2}$ are coprime. Now the product of two
coprime numbers is a cube only when each is a cube, so


$$y + \sqrt{-2} = (a + b\sqrt{-2})^3,$$
and comparing coefficients of $\sqrt{-2}$,
$$1 = b(3a^2 - 2b^2)$$
for which the only solutions are $b = 1$, $a = \pm 1$. Then $x = 3$, $y = \pm 5$. □


The flaw in this intuitive ‘proof’, as it stands, is that we are carrying
over the language of factorization of integers to factorization in $\mathbb{Z}[\sqrt{-2}]$
without checking that the usual properties actually hold in $\mathbb{Z}[\sqrt{-2}]$. In
this chapter we develop the appropriate theory and investigate when it
generalizes to a ring of integers in a number field.
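
Fermat's statement can at least be checked numerically over a finite range. The following Python sketch is a plain brute-force search; the function name `fermat_cube_solutions` and the search bound are our own illustrative choices, and the output is evidence for the statement, not a proof of it.

```python
from math import isqrt

def fermat_cube_solutions(max_x):
    """Return all (x, y) with 1 <= x <= max_x, y >= 0 and y*y + 2 == x**3."""
    found = []
    for x in range(1, max_x + 1):
        target = x**3 - 2          # this must be a perfect square y^2
        if target < 0:
            continue
        y = isqrt(target)
        if y * y == target:
            found.append((x, y))
    return found

print(fermat_cube_solutions(10_000))   # [(3, 5)]
```

Only non-negative $y$ is searched, since $(x, -y)$ is a solution whenever $(x, y)$ is.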


4.2 Trivial Factorizations 


If $u$ is a unit in a ring $R$, then any element $x \in R$ can be trivially factorized
as


$$x = uy$$


where $y = u^{-1}x$. An element $y$ is called an associate of $x$ if $x = uy$ for a
unit $u$. Recall that a factorization $x = yz$ of $x \in R$ is said to be proper if
neither $y$ nor $z$ is a unit. If a factorization is not proper (in which case
we shall call it trivial) then it is clear that one of the factors is a unit and
the other is an associate of $x$. Before going on to proper factorizations we
therefore look at elementary properties of units and associates. We denote
the set of units in a ring $R$ by $U(R)$.


Proposition 4.1. The units $U(R)$ of a ring $R$ form a group under
multiplication. □


Examples. 
4.1 $R = \mathbb{Q}$. The units are $U(\mathbb{Q}) = \mathbb{Q} \setminus \{0\}$, which is an infinite group.
4.2 $R = \mathbb{Z}$. The units are $\pm 1$, so $U(\mathbb{Z})$ is cyclic of order 2.


4.3 $R = \mathbb{Z}[i]$, the Gaussian integers $a + ib$ ($a, b \in \mathbb{Z}$). The element $a + ib$
is a unit if and only if there exists $c + id$ ($c, d \in \mathbb{Z}$) such that


$$(a + ib)(c + id) = 1.$$


This implies $ac - bd = 1$, $ad + bc = 0$, whence $c = a/(a^2 + b^2)$,
$d = -b/(a^2 + b^2)$. These have integer solutions only when $a^2 + b^2 = 1$,
so $a = \pm 1$, $b = 0$, or $a = 0$, $b = \pm 1$. Hence the units are $\{1, -1, i, -i\}$
and $U(R)$ is cyclic of order 4.


By using norms, we can extend the results of Example 4.3 to the more
general case of the units in the ring of integers of $\mathbb{Q}(\sqrt{d})$ for $d$ negative and
squarefree:


Proposition 4.2. The group of units $U$ of the integers in $\mathbb{Q}(\sqrt{d})$ where $d$
is negative and squarefree is as follows:


(a) For $d = -1$, $U = \{\pm 1, \pm i\}$.
(b) For $d = -3$, $U = \{\pm 1, \pm\omega, \pm\omega^2\}$ where $\omega = e^{2\pi i/3}$.
(c) For all other $d < 0$, $U = \{\pm 1\}$.


Proof: Suppose $\alpha$ is a unit in the ring of integers of $\mathbb{Q}(\sqrt{d})$ with inverse
$\beta$; then $\alpha\beta = 1$, so taking norms


$$N(\alpha)N(\beta) = 1.$$


But $N(\alpha)$, $N(\beta)$ are rational integers, so $N(\alpha) = \pm 1$. Writing $\alpha = a + b\sqrt{d}$
($a, b \in \mathbb{Q}$), we see that $N(\alpha) = a^2 - db^2$ is positive (for negative $d$), so
$N(\alpha) = +1$. Hence we are reduced to solving the equation


$$a^2 - db^2 = 1.$$

If $a, b \in \mathbb{Z}$, then for $d = -1$ this reduces to
$$a^2 + b^2 = 1$$


which has the solutions $a = \pm 1$, $b = 0$, or $a = 0$, $b = \pm 1$, already found
in Example 4.3. This gives (a). For $d < -1$ we immediately conclude that
$b = 0$ (otherwise $a^2 - db^2$ would exceed 1), so the only rational integer
solutions are $a = \pm 1$, $b = 0$. If $d \not\equiv 1 \pmod 4$, then $a, b \in \mathbb{Z}$, so the only
solutions are those discovered. For $d \equiv 1 \pmod 4$, however, we must also
consider the additional possibility $a = A/2$, $b = B/2$ where both $A$ and $B$
are odd rational integers. In this case


$$A^2 - dB^2 = 4.$$


For $d < -3$, we deduce $B = 0$, which is impossible for odd $B$, so there are no
additional solutions. This completes (c). For $d = -3$, we find additional
solutions $A = \pm 1$, $B = \pm 1$. The case $A = -1$, $B = 1$ gives


$$\alpha = \tfrac{1}{2}(-1 + \sqrt{-3}) = e^{2\pi i/3}$$


which we have denoted by $\omega$. The other three cases give $-\omega$, $\omega^2$, $-\omega^2$.
These allied with the solutions already found give (b). □
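
The case analysis in this proof is a finite search, so it can be replayed mechanically. The sketch below is our own illustration (the function name `units` and the representation of $a + b\sqrt{d}$ as a pair `(a, b)` of fractions are assumptions, not notation from the text); it solves the norm equation $a^2 - db^2 = 1$ over the ring of integers for a given negative squarefree $d$.

```python
from fractions import Fraction

def units(d):
    """Units a + b*sqrt(d) of the ring of integers of Q(sqrt(d)), as pairs (a, b).

    Assumes d is a negative squarefree integer.
    """
    found = []
    if d % 4 == 1:
        # Integers are (A + B*sqrt(d))/2 with A, B of the same parity;
        # the norm equation becomes A^2 - d*B^2 = 4, which forces |A|, |B| <= 2.
        for A in range(-2, 3):
            for B in range(-2, 3):
                if (A - B) % 2 == 0 and A * A - d * B * B == 4:
                    found.append((Fraction(A, 2), Fraction(B, 2)))
    else:
        # Integers are a + b*sqrt(d) with a, b in Z; solve a^2 - d*b^2 = 1.
        for a in range(-1, 2):
            for b in range(-1, 2):
                if a * a - d * b * b == 1:
                    found.append((Fraction(a), Fraction(b)))
    return found

for d in (-1, -2, -3, -7):
    print(d, len(units(d)))   # unit counts: 4, 2, 6, 2
```

The counts agree with parts (a), (b) and (c) of Proposition 4.2 for these sample values of $d$.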


The general case of units in a ring of integers in a number field will be 
postponed until Appendix B. We now return to simple properties of units 
and associates. 

It follows from Proposition 4.1 that ‘being associates’ is an equivalence 
relation on $R$. The only associate of 0 is 0 itself. Recall that a non-unit
$x \in R$ is called an irreducible if it has no proper factors. The zero element
$0 = 0 \cdot 0$ has factors, neither of which is a unit, so in particular an irreducible
is non-zero. We now list a few elementary properties of units, associates
and irreducibles. To prove some of these we shall require the cancellation 
law, so we must take the ring to be an integral domain. 


Proposition 4.3. For a domain $D$,
(a) $x$ is a unit if and only if $x \mid 1$,
(b) Any two units are associates and any associate of a unit is a unit,
(c) $x$, $y$ are associates if and only if $x \mid y$ and $y \mid x$,
(d) $x$ is irreducible if and only if every divisor of $x$ is an associate of $x$
or a unit,
(e) an associate of an irreducible is irreducible.


Proof: Most of these follow straight from the definitions. We prove (c)
which requires the cancellation law. Suppose $x \mid y$ and $y \mid x$; then there exist
$a, b \in D$ such that $y = ax$, $x = by$. Substituting, we find
$$x = bax.$$


Now either $x = 0$, in which case $y = 0$ also and they are associates, or
$x \neq 0$ and we cancel $x$ to find


$$1 = ba,$$
so $a$ and $b$ are units. Hence $x$, $y$ are associates. The converse is trivial. □

Some of these ideas may be usefully expressed in terms of ideals:


Proposition 4.4. If $D$ is a domain and $x$, $y$ are non-zero elements of $D$
then

(a) $x \mid y$ if and only if $(x) \supseteq (y)$,

(b) $x$ and $y$ are associates if and only if $(x) = (y)$,

(c) $x$ is a unit if and only if $(x) = D$,

(d) $x$ is irreducible if and only if $(x)$ is maximal among the proper
principal ideals of $D$.


Proof: (a) If $x \mid y$ then $y = zx \in (x)$ for some $z \in D$, hence $(y) \subseteq (x)$.
Conversely, if $(y) \subseteq (x)$ then $y \in (x)$, so $y = zx$ for some $z \in D$.

(b) is immediate from (a).

(c) If $x$ is a unit then $xv = 1$ for some $v \in D$, hence for any $y \in D$ we
have $y = xvy \in (x)$ and $D = (x)$. If $D = (x)$ then since $1 \in D$, $1 = zx$ for
some $z \in D$ and $x$ is a unit.

(d) Suppose $x$ is irreducible, with $(x) \subsetneq (y) \subsetneq D$. Then $y \mid x$ but $y$ is
neither a unit, nor an associate of $x$, contradicting 4.3 (d). Conversely, if
no such $y$ exists, then every divisor of $x$ is either a unit or an associate, so
$x$ is irreducible. □


4.3 Factorization into Irreducibles 
In a domain $D$, if a non-unit $x$ is reducible, we can write it as


$$x = ab.$$


If either of $a$ or $b$ is reducible, we can express it as a product of proper
factors; then carry on the process, seeking to write


$$x = p_1 p_2 \cdots p_m$$
where each $p_i$ is irreducible. We say that factorization into irreducibles is
possible in $D$ if every $x \in D$, not a unit nor zero, is a product of a finite
number of irreducibles. In general such a factorization may not be possible;
and an example is ready to hand, namely the ring $\mathfrak{B}$ of all algebraic integers.
For if $\alpha$ is not zero or a unit, neither is $\sqrt{\alpha}$. Since $\alpha = \sqrt{\alpha} \cdot \sqrt{\alpha}$ and $\sqrt{\alpha}$ is
an algebraic integer, it follows that $\alpha$ is not irreducible. Thus $\mathfrak{B}$ has no irreducibles
at all, but it does have non-zero non-units, so factorization into irreducibles
is not possible.

This trouble does not arise in the ring $\mathfrak{O}$ of integers of a number field
(which is another reason why we concentrate on such rings instead of the
whole of $\mathfrak{B}$). We will prove the possibility of factorization in $\mathfrak{O}$ by
introducing a more general notion which makes the arguments involved more
transparent. We define a domain $D$ to be noetherian if every ideal in $D$
is finitely generated. The adjective commemorates Emmy Noether
(1882–1935) who introduced the concept. Having demonstrated the possibility of
factorization in any noetherian ring, we will show that $\mathfrak{O}$ is noetherian, so
factorization is possible here also.

Two useful properties, which we shall see are each equivalent to the
noetherian condition, are:

The ascending chain condition. Given an ascending chain of ideals:


$$I_0 \subseteq I_1 \subseteq I_2 \subseteq \cdots \subseteq I_n \subseteq \cdots \qquad (4.1)$$


then there exists some $N$ for which $I_n = I_N$ for all $n \geq N$. That is, every
ascending chain stops.


The maximal condition. Every non-empty set of ideals has a maximal
element, that is, an element which is not properly contained in any other
element.

We remark that this maximal element need not contain all the other 
ideals in the given set: we require only that there is no other element in 
the set that contains it. 


Proposition 4.5. The following conditions are equivalent for an integral
domain $D$:

(a) $D$ is noetherian,

(b) $D$ satisfies the ascending chain condition,

(c) $D$ satisfies the maximal condition.


Proof: Assume (a). Consider an ascending chain as in (4.1). Let
$I = \bigcup_{n=0}^{\infty} I_n$. Then $I$ is an ideal, so finitely generated: say $I = (x_1, \ldots, x_m)$.
Each $x_i$ belongs to some $I_{n(i)}$. If we let $N = \max_i n(i)$, then we have
$I = I_N$ and it follows that $I_n = I_N$ for all $n \geq N$, proving (b).

Now suppose (b) and consider a non-empty set $S$ of ideals. Suppose
for a contradiction that $S$ does not have a maximal element. Pick $I_0 \in S$.
Since $I_0$ is not maximal we can pick $I_1 \in S$ with $I_0 \subsetneq I_1$. Inductively,
having found $I_n$, since this is not maximal, we can pick $I_{n+1} \in S$ with
$I_n \subsetneq I_{n+1}$. But now we have an ascending chain which does not stop,
which is a contradiction. So (b) implies (c). (The reader who wishes to
may ponder the use of the axiom of choice in this proof.)

Finally, suppose (c). Let $I$ be any ideal, and let $S$ be the set of all
finitely generated ideals contained in $I$. Then $\{0\} \in S$, so $S$ is non-empty
and thus has a maximal element $J$. If $J \neq I$, pick $x \in I \setminus J$. Then $(J, x)$ is
finitely generated and strictly larger than $J$, a contradiction. Hence $J = I$
and $I$ is finitely generated. □


Theorem 4.6. If a domain D is noetherian, then factorization into irre- 
ducibles is possible in D. 


Proof: Suppose that $D$ is noetherian, but there exists a non-unit $x \neq 0$ in
$D$ which cannot be expressed as a product of a finite number of irreducibles.
Choose $x$ so that $(x)$ is maximal subject to these conditions on $x$, which
is possible by the maximal condition. By its definition, this $x$ cannot be
irreducible, so $x = yz$ where $y$ and $z$ are not units. Then $(y) \supseteq (x)$ by
4.4(a). If $(y) = (x)$ then $x$ and $y$ are associates by 4.4(b) and this is not
the case because it implies that $z$ is a unit. So $(y) \supsetneq (x)$, and similarly
$(z) \supsetneq (x)$. By maximality of $(x)$, we must have


$$y = p_1 \cdots p_r,$$
$$z = q_1 \cdots q_s,$$


where each $p_i$ and $q_j$ is irreducible. Multiplying these together expresses
$x$ as a product of irreducibles, a contradiction. Hence the assumption that
there existed a non-unit $\neq 0$ which is not a finite product of irreducibles is
false, and factorization into irreducibles is always possible. □


We are now in business because: 


Theorem 4.7. The ring of integers $\mathfrak{O}$ in a number field $K$ is noetherian.


Proof: We prove that every ideal $I$ of $\mathfrak{O}$ is finitely generated. Now $(\mathfrak{O}, +)$
is free abelian of rank $n$ equal to the degree of $K$ by Theorem 2.16. Hence
$(I, +)$ is free abelian of rank $s \leq n$ by Theorem 1.16. If $\{x_1, \ldots, x_s\}$ is a
$\mathbb{Z}$-basis for $(I, +)$, then clearly $(x_1, \ldots, x_s) = I$, so $I$ is finitely generated
and $\mathfrak{O}$ is noetherian. □

Corollary 4.8. Factorization into irreducibles is possible in $\mathfrak{O}$. □


To get very far in the theory, we need some easy ways of detecting units
and irreducibles in $\mathfrak{O}$. The norm proves to be a convenient tool:


Proposition 4.9. Let $\mathfrak{O}$ be the ring of integers in a number field $K$, and let
$x, y \in \mathfrak{O}$. Then

(a) $x$ is a unit if and only if $N(x) = \pm 1$,

(b) If $x$ and $y$ are associates, then $N(x) = \pm N(y)$,

(c) If $N(x)$ is a rational prime, then $x$ is irreducible in $\mathfrak{O}$.


Proof: (a) If $xu = 1$, then $N(x)N(u) = 1$. Since $N(x), N(u) \in \mathbb{Z}$, we have
$N(x) = \pm 1$. Conversely, if $N(x) = \pm 1$, then


$$\sigma_1(x)\sigma_2(x) \cdots \sigma_n(x) = \pm 1$$


where the $\sigma_i$ are the monomorphisms $K \to \mathbb{C}$. One factor, without loss of
generality $\sigma_1(x)$, is equal to $x$; all the other $\sigma_i(x)$ are algebraic integers. Put


$$u = \pm\sigma_2(x) \cdots \sigma_n(x).$$


Then $xu = 1$, so $u = x^{-1} \in K$. Hence $u \in K \cap \mathfrak{B} = \mathfrak{O}$, and $x$ is a unit.
(b) If $x$, $y$ are associates, then $x = uy$ for a unit $u$, so $N(x) = N(uy) =
N(u)N(y) = \pm N(y)$ by (a).
(c) Let $x = yz$. Then $N(y)N(z) = N(yz) = N(x) = p$, a rational prime;
so one of $N(y)$ and $N(z)$ is $\pm p$ and the other is $\pm 1$. By (a), one of $y$ and
$z$ is a unit, so $x$ is irreducible. □


We have not asserted converses to parts (b) and (c) because these are 
generally false, as examples in the next section readily reveal. 


4.4 Examples of Non-Unique Factorization into Irreducibles


We say that factorization in a domain $D$ is unique if, whenever


$$p_1 \cdots p_r = q_1 \cdots q_s$$


where every $p_i$ and $q_j$ is irreducible in $D$, it follows that

(a) $r = s$,

(b) There is a permutation $\pi$ of $\{1, \ldots, r\}$ such that $p_i$ and $q_{\pi(i)}$ are
associates for all $i = 1, \ldots, r$.

In view of our earlier remarks about trivial factorizations, this is the best
we can hope for. It says that a factorization into irreducibles (if it exists)
is unique except for the order of the factors and the possible presence of
units. Variation to this extent is necessary, since even in $\mathbb{Z}$ we have, for
instance,


$$3 \cdot 5 = 5 \cdot 3 = (-3)(-5) = (-5)(-3).$$


Unfortunately, factorization into irreducibles need not be unique in a ring 
of integers of an algebraic number field. Examples are quite easy to come 
by if one looks in the right places, and to drive the point home we shall give 
quite a lot of them. They will be drawn from quadratic fields, and we state 
them as positive theorems. The easiest come from imaginary quadratic 
fields: 


Theorem 4.10. Factorization into irreducibles is not unique in the ring of
integers of $\mathbb{Q}(\sqrt{d})$ for (at least) the following values of $d$: $-5$, $-6$, $-10$,
$-13$, $-14$, $-15$, $-17$, $-21$, $-22$, $-23$, $-26$, $-29$, $-30$.


Proof: In $\mathbb{Q}(\sqrt{-5})$ we have the factorizations
$$6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5}).$$


We claim that 2, 3, $1 + \sqrt{-5}$ and $1 - \sqrt{-5}$ are irreducible in the ring $\mathfrak{O}$ of
integers of $\mathbb{Q}(\sqrt{-5})$. Since the norm is given by


$$N(a + b\sqrt{-5}) = a^2 + 5b^2$$


their norms are 4, 9, 6, 6, respectively. If $2 = xy$ where $x, y \in \mathfrak{O}$ are
non-units, then $4 = N(2) = N(x)N(y)$ so that $N(x) = \pm 2$, $N(y) = \pm 2$. Similarly
non-trivial divisors of 3 must, if they exist, have norm $\pm 3$, whilst non-trivial
divisors of $1 \pm \sqrt{-5}$ must have norm $\pm 2$ or $\pm 3$. Since $-5 \not\equiv 1 \pmod 4$, the
integers in $\mathfrak{O}$ are of the form $a + b\sqrt{-5}$ for $a, b \in \mathbb{Z}$ (Theorem 3.2) so we
are led to the equations


$$a^2 + 5b^2 = \pm 2 \text{ or } \pm 3 \quad (a, b \in \mathbb{Z}).$$


Now $|b| \geq 1$ implies $|a^2 + 5b^2| \geq 5$, so the only possibility is $|b| = 0$; but
then we have $a^2 = \pm 2$ or $\pm 3$, which is impossible in integers. Thus the
putative divisors do not exist, and the four factors are all irreducible. Since
$N(2) = 4$, $N(1 \pm \sqrt{-5}) = 6$, by Proposition 4.9(b), 2 is not an associate of
$1 + \sqrt{-5}$ or $1 - \sqrt{-5}$, so factorization is not unique.

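The other stated values of $d$ are dealt with in exactly the same way
(with a few slight subtleties noted at the end of the proof) starting from
the following factorizations:


$\mathbb{Q}(\sqrt{-6})$: $6 = 2 \cdot 3 = (\sqrt{-6})(-\sqrt{-6})$
$\mathbb{Q}(\sqrt{-10})$: $14 = 2 \cdot 7 = (2 + \sqrt{-10})(2 - \sqrt{-10})$
$\mathbb{Q}(\sqrt{-13})$: $14 = 2 \cdot 7 = (1 + \sqrt{-13})(1 - \sqrt{-13})$
$\mathbb{Q}(\sqrt{-14})$: $15 = 3 \cdot 5 = (1 + \sqrt{-14})(1 - \sqrt{-14})$
$\mathbb{Q}(\sqrt{-15})$: $4 = 2 \cdot 2 = \left(\frac{1 + \sqrt{-15}}{2}\right)\left(\frac{1 - \sqrt{-15}}{2}\right)$
$\mathbb{Q}(\sqrt{-17})$: $18 = 2 \cdot 3 \cdot 3 = (1 + \sqrt{-17})(1 - \sqrt{-17})$
$\mathbb{Q}(\sqrt{-21})$: $22 = 2 \cdot 11 = (1 + \sqrt{-21})(1 - \sqrt{-21})$
$\mathbb{Q}(\sqrt{-22})$: $26 = 2 \cdot 13 = (2 + \sqrt{-22})(2 - \sqrt{-22})$
$\mathbb{Q}(\sqrt{-23})$: $6 = 2 \cdot 3 = \left(\frac{1 + \sqrt{-23}}{2}\right)\left(\frac{1 - \sqrt{-23}}{2}\right)$
$\mathbb{Q}(\sqrt{-26})$: $27 = 3 \cdot 3 \cdot 3 = (1 + \sqrt{-26})(1 - \sqrt{-26})$
$\mathbb{Q}(\sqrt{-29})$: $30 = 2 \cdot 3 \cdot 5 = (1 + \sqrt{-29})(1 - \sqrt{-29})$
$\mathbb{Q}(\sqrt{-30})$: $34 = 2 \cdot 17 = (2 + \sqrt{-30})(2 - \sqrt{-30})$.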


Points to note are the following: In cases $-15$ and $-23$, note that
$d \equiv 1 \pmod 4$ and be careful. For $-26$ it is easy to prove 3 irreducible. For
$1 + \sqrt{-26}$ we are led to the equation $N(x)N(y) = 27$, so $N(x) = \pm 9$, $N(y) = \pm 3$,
or the other way round. This leads to the equations $a^2 + 26b^2 = \pm 9$
or $\pm 3$. There is a solution for $\pm 9$, but not for $\pm 3$, and the latter is
sufficient to show $1 + \sqrt{-26}$ is irreducible. □
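
The norm computations in this proof are finite checks and can be replayed by machine. The sketch below is our own illustration: the helper names `has_norm` and `mul`, and the representation of $a + b\sqrt{d}$ as an integer pair `(a, b)`, are assumptions made for the example. It applies only when $d < 0$ and $d \not\equiv 1 \pmod 4$, so that every integer of the field has the form $a + b\sqrt{d}$ with $a, b \in \mathbb{Z}$.

```python
from math import isqrt

def has_norm(d, n):
    """True if a*a - d*b*b == n for some integers a, b (assumes d < 0 < n)."""
    for b in range(isqrt(n // -d) + 1):
        rest = n + d * b * b          # candidate value of a^2, never negative here
        if isqrt(rest) ** 2 == rest:
            return True
    return False

def mul(x, y, d):
    """Product of x = (a, b) and y = (c, e), meaning a + b*sqrt(d) and c + e*sqrt(d)."""
    a, b = x
    c, e = y
    return (a * c + d * b * e, a * e + b * c)

d = -5
# Both factorizations really give 6.
assert mul((2, 0), (3, 0), d) == mul((1, 1), (1, -1), d) == (6, 0)
# No element has norm 2 or 3, so 2, 3 and 1 +/- sqrt(-5) have no proper divisors.
print(has_norm(d, 2), has_norm(d, 3))        # False False
# The case d = -26: norm 9 occurs (a = 3, b = 0) but norm 3 does not.
print(has_norm(-26, 9), has_norm(-26, 3))    # True False
```

For $d \equiv 1 \pmod 4$ (the cases $-15$ and $-23$) the search would also have to include half-integers, which this sketch deliberately does not attempt.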


Examining this list, we see that in the ring of integers of $\mathbb{Q}(\sqrt{-17})$ there
is an example to show that even the number of irreducible factors may differ;
the case $\mathbb{Q}(\sqrt{-26})$ shows that the number of distinct factors may differ and
that even a (rational) prime power may factorize non-uniquely.

For real quadratic fields there are similar results, but these are harder
to find. Also, since the norm is $a^2 - db^2$, it is harder to prove given numbers
irreducible. With the same range of values as in Theorem 4.10 we find:


Theorem 4.11. Factorization into irreducibles is not unique in the ring of
integers of $\mathbb{Q}(\sqrt{d})$ for (at least) the following values of $d$:


10, 15, 26, 30.

Proof: In the integers of $\mathbb{Q}(\sqrt{10})$ we have factorizations:
$$6 = 2 \cdot 3 = (4 + \sqrt{10})(4 - \sqrt{10}).$$


We prove 2, 3, $4 \pm \sqrt{10}$ irreducible. Looking at norms this amounts to
proving that the equations


$$a^2 - 10b^2 = \pm 2 \text{ or } \pm 3$$
have no solutions in integers $a$, $b$. It is no longer helpful to look at the size
of $|b|$, because of the minus sign. However, the equation implies


$$a^2 \equiv \pm 2 \text{ or } \pm 3 \pmod{10}$$
or equivalently
$$a^2 \equiv 2, 3, 7 \text{ or } 8 \pmod{10}.$$


The squares (mod 10) are, in order, 0, 1, 4, 9, 6, 5, 6, 9, 4, 1; by a
seemingly remarkable coincidence, the numbers we are looking for are precisely
those that do not occur. Hence no solutions exist and the four factors are
irreducible. Now 2 and $4 \pm \sqrt{10}$ are not associates, since their norms are
4, 6 respectively.
Similarly we have:
$\mathbb{Q}(\sqrt{15})$: $10 = 2 \cdot 5 = (5 + \sqrt{15})(5 - \sqrt{15})$
$\mathbb{Q}(\sqrt{26})$: $10 = 2 \cdot 5 = (6 + \sqrt{26})(6 - \sqrt{26})$
$\mathbb{Q}(\sqrt{30})$: $6 = 2 \cdot 3 = (6 + \sqrt{30})(6 - \sqrt{30})$.
The reader will find it instructive to carry out the calculations. □
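
The congruence step for $d = 10$ is easy to confirm directly. The following lines are a small sketch of our own (the helper name `squares_mod` is hypothetical); they list the squares modulo 10 and compare them with the residues of $\pm 2$ and $\pm 3$.

```python
def squares_mod(m):
    """Sorted list of the distinct residues a^2 mod m."""
    return sorted({a * a % m for a in range(m)})

squares = squares_mod(10)
targets = sorted({t % 10 for t in (2, -2, 3, -3)})

print(squares)                            # [0, 1, 4, 5, 6, 9]
print(targets)                            # [2, 3, 7, 8]
print(set(squares).isdisjoint(targets))   # True
```

Since no square is congruent to 2, 3, 7 or 8 modulo 10, the equations $a^2 - 10b^2 = \pm 2$ or $\pm 3$ have no integer solutions.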


The values of d considered in Theorems 4.10 and 4.11 have not, despite 
appearances, been chosen at random. If one attempts similar tricks with 
other $d$ in the range $-30$ to 30, nothing seems to work. Thus in $\mathbb{Q}(\sqrt{-19})$
we get


$$5 = \left(\frac{1 + \sqrt{-19}}{2}\right)\left(\frac{1 - \sqrt{-19}}{2}\right)$$


but all this shows is that 5 is reducible.
Trying another obvious product in the integers of $\mathbb{Q}(\sqrt{-19})$, we find


$$(2 + \sqrt{-19})(2 - \sqrt{-19}) = 23$$
which just tells us that 23 is also reducible. The case


$$7 = \left(\frac{3 + \sqrt{-19}}{2}\right)\left(\frac{3 - \sqrt{-19}}{2}\right)$$


shows 7 is reducible. After more of these calculations we may alight on
$$35 = 5 \cdot 7 = (4 + \sqrt{-19})(4 - \sqrt{-19}).$$


Will this prove non-uniqueness? No, because neither 5 nor 7 is irreducible,
as we have seen; and neither is $4 + \sqrt{-19}$.

The complete factorization of 35 is


$$35 = \left(\frac{1 + \sqrt{-19}}{2}\right)\left(\frac{1 - \sqrt{-19}}{2}\right)\left(\frac{3 + \sqrt{-19}}{2}\right)\left(\frac{3 - \sqrt{-19}}{2}\right)$$


and the two apparently distinct factorizations come from different
groupings of these in pairs. Eventually one is led to conjecture that the integers
of $\mathbb{Q}(\sqrt{-19})$ have unique factorization. This is indeed true, but we shall
not be in any position to prove it until Chapter 10. In fact the ring of
integers of $\mathbb{Q}(\sqrt{d})$ for negative squarefree $d$ has unique factorization into
irreducibles if and only if $d$ takes one of the values:


$$-1, -2, -3, -7, -11, -19, -43, -67, -163.$$


Numerical evidence available in the time of Gauss pointed to this result. 
In 1934 Heilbronn and Linfoot [36] showed that at most one further value 
of d could occur, and that |d| had to be very large. In 1952 Heegner [35] 
offered a proof but it was thought to contain a gap. In 1967 Stark [69] 
found a proof, as did Baker [3] soon after. Finally Birch [5], Deuring [20] 
and Siegel [68] filled in the gap in Heegner’s proof. The methods of this 
book are not appropriate to give any of these proofs, but we will prove in 
Chapter 10 that for these nine values factorization is unique. 
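
The arithmetic behind the $\mathbb{Q}(\sqrt{-19})$ discussion above can be checked with exact rational arithmetic. The sketch below is our own illustration: an element $a + b\sqrt{-19}$ is stored as a pair `(a, b)` of fractions, and the names `mul`, `p`, `p_bar`, `q`, `q_bar` are hypothetical labels for $(1 \pm \sqrt{-19})/2$ and $(3 \pm \sqrt{-19})/2$.

```python
from fractions import Fraction as F

D = -19

def mul(x, y):
    """Multiply x = (a, b) and y = (c, e), standing for a + b*sqrt(-19) and c + e*sqrt(-19)."""
    a, b = x
    c, e = y
    return (a * c + D * b * e, a * e + b * c)

half = F(1, 2)
p, p_bar = (half, half), (half, -half)            # (1 +/- sqrt(-19))/2
q, q_bar = (3 * half, half), (3 * half, -half)    # (3 +/- sqrt(-19))/2

print(mul(p, p_bar) == (5, 0))          # True: 5 is reducible
print(mul(q, q_bar) == (7, 0))          # True: 7 is reducible
print(mul(p, q) == (-4, 1))             # True: p*q = -(4 - sqrt(-19))
print(mul(p_bar, q_bar) == (-4, -1))    # True: p_bar*q_bar = -(4 + sqrt(-19))
```

So $5 \cdot 7$ and $(4 + \sqrt{-19})(4 - \sqrt{-19})$ are two groupings of the same four factors, up to the unit $-1$. This confirms the bookkeeping only; it does not prove that factorization is unique in this ring.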

The situation for positive d is not at all well understood. Factorization 
is unique in many more cases, for instance 2, 3, 5, 6, 7, 11, 13, 14, 17, 19, 
21, 22, 23, 29, 31, 33, 37, 38, 41, 43, 46, 47, 53, 57, 59, 61, 62, 67, 69, 71, 
73, 77, 83, 86, 89, 93, 94, 97, ... (these being all for d less than 100). It 
is not even known whether unique factorization occurs for infinitely many 
d>0. 

So far we have not proved uniqueness of factorization for the ring of 
integers in any number fields (apart from Z). In the next section we intro- 
duce a criterion which tells us when factorization is unique in terms of a 
special property of the irreducibles. 


4.5 Prime Factorization 


We have already noted that an irreducible $p$ in $\mathbb{Z}$ satisfies the additional
property


$$p \mid mn \text{ implies } p \mid m \text{ or } p \mid n.$$


In this section we shall show that it is this property which characterizes
uniqueness of factorization.

In a domain $D$ an element $x$ is said to be prime if it is not zero or a
unit and


$$x \mid ab \text{ implies } x \mid a \text{ or } x \mid b.$$


Note that the zero element satisfies the given property in a domain, but we 
exclude it to correspond with the definition of prime in Z, where 0 is not 
usually considered a prime. This convention allows us to state: 


Proposition 4.12. A prime in a domain D is always irreducible. 


Proof: Suppose that $D$ is a domain, $x \in D$ is prime, and $x = ab$. Then
$x \mid ab$, so $x \mid a$ or $x \mid b$.
If $x \mid a$, then $a = xc$ ($c \in D$), so


$$x = xcb$$
and cancelling $x$ (which is non-zero), we see that
$$1 = cb$$
and $b$ is a unit. In the same way, $x \mid b$ implies $a$ is a unit. □


The converse of this result is not true, as Eisenstein lamented in 1844;
in many domains there exist irreducibles that are not primes. For example
in $\mathbb{Z}[\sqrt{-5}]$ we have


$$6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5}),$$


but 2 does not divide either of $1 + \sqrt{-5}$ or $1 - \sqrt{-5}$ (as we saw in the
proof of Theorem 4.10). So 2 is an irreducible in $\mathbb{Z}[\sqrt{-5}]$, but not prime.
The factorizations in the proofs of Theorems 4.10 and 4.11 readily yield
other examples. The next theorem tells us that such examples are entirely
typical: every domain with non-unique factorization contains irreducibles
that are not prime.


Theorem 4.13. In a domain in which factorization into irreducibles is 
possible, factorization is unique if and only if every irreducible is prime. 


Proof: Let $D$ be the domain. It is convenient to rephrase the possibility
of factorization for all non-zero $x \in D$ as


$$x = u p_1 \cdots p_r$$
where $u$ is a unit and $p_1, \ldots, p_r$ are irreducibles. When $r = 0$ this is
interpreted as saying that $x = u$ is a unit, and when $r \geq 1$, $u p_1$ is an
irreducible, so $x$ is a product of the irreducibles $u p_1, p_2, \ldots, p_r$.


Now for the proof. Suppose first that factorization is unique and $p$ is
an irreducible. We must show $p$ is prime.


If $p \mid ab$, then $pc = ab$ ($c \in D$).


We need consider only the non-trivial case $a \neq 0$, $b \neq 0$ which implies $c \neq 0$
also.
Factorize $a$, $b$, $c$ into irreducibles:


$$a = u_1 p_1 \cdots p_n$$
$$b = u_2 q_1 \cdots q_m$$
$$c = u_3 r_1 \cdots r_s$$


where each $u_i$ is a unit and $p_i$, $q_i$ and $r_i$ are irreducible. Then


$$p(u_3 r_1 \cdots r_s) = (u_1 p_1 \cdots p_n)(u_2 q_1 \cdots q_m),$$


and unique factorization implies $p$ is an associate of (hence divides) one of the
$p_i$ or $q_j$, so divides $a$ or $b$. Hence $p$ is prime.

Conversely, suppose that every irreducible is prime. We will
demonstrate that if


$$u_1 p_1 \cdots p_m = u_2 q_1 \cdots q_n \qquad (4.2)$$


where $u_1$, $u_2$ are units and the $p_i$, $q_j$ are irreducibles, then $m = n$ and
there is a permutation $\pi$ of $\{1, \ldots, m\}$ such that $p_i$ and $q_{\pi(i)}$ are associates
($1 \leq i \leq m$).

This is trivially true for $m = 0$.

For $m \geq 1$, if (4.2) holds, then $p_m \mid u_2 q_1 \cdots q_n$. But $p_m$ is prime, so (by
induction on $n$), $p_m \mid u_2$ or $p_m \mid q_j$ for some $j$. The first of these possibilities
would imply $p_m$ is a unit (by Proposition 4.3(a)), so we must conclude that
$p_m \mid q_j$. We renumber so that $j = n$, then $p_m \mid q_n$ and $q_n = p_m u$ where $u$ is a
unit. So


$$u_1 p_1 \cdots p_m = u_2 q_1 \cdots q_{n-1} u p_m$$
and cancelling $p_m$,
$$u_1 p_1 \cdots p_{m-1} = (u_2 u) q_1 \cdots q_{n-1}.$$


By induction we may suppose $m - 1 = n - 1$ and there is a permutation $\pi$ of
$\{1, \ldots, m-1\}$ such that $p_i$, $q_{\pi(i)}$ are associates ($1 \leq i \leq m-1$). We can then
extend $\pi$ to $\{1, \ldots, m\}$ by defining $\pi(m) = m$ to give the required result.
□

A domain $D$ is called a unique factorization domain if factorization into
irreducibles is possible and unique. In a unique factorization domain all
irreducibles are primes, so we may speak of a factorization into irreducibles
as a ‘prime factorization’. Theorem 4.13 tells us that a prime factorization
is unique in the usual sense.

We can immediately generalize many ideas on factorization to a unique
factorization domain. For instance, if $a, b \in D$, then the highest common
factor $h$ of $a$, $b$ is defined to be an element that satisfies

(i) $h \mid a$, $h \mid b$,

(ii) If $h' \mid a$, $h' \mid b$, then $h' \mid h$.

If $a$ is zero, the highest common factor of $a$, $b$ is $b$. For $a, b \neq 0$, in a
unique factorization domain we can write


$$a = u_1 p_1^{e_1} \cdots p_n^{e_n}$$
$$b = u_2 p_1^{f_1} \cdots p_n^{f_n}$$


where $u_1$, $u_2$ are units and the $p_i$ are distinct (i.e. non-associate) primes
with non-negative integer exponents $e_i$, $f_i$. Then it is easy to show that
$$h = u p_1^{m_1} \cdots p_n^{m_n}$$
where $u$ is any unit and $m_i$ is the smaller of $e_i$, $f_i$ ($1 \leq i \leq n$). The highest
common factor is unique up to multiplication by a unit. We can say that
$a$, $b$ are coprime if their highest common factor is 1 (or any other unit).

In the same way we can define the lowest common multiple $l$ of $a$, $b$ to
satisfy

(iii) $a \mid l$, $b \mid l$,

(iv) If $a \mid l'$, $b \mid l'$, then $l \mid l'$.

For non-zero $a$, $b$ this is
$$l = u p_1^{k_1} \cdots p_n^{k_n}$$
where $k_i$ is the larger of $e_i$, $f_i$ in the factorizations noted above.
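
In the familiar unique factorization domain $\mathbb{Z}$ the two exponent formulas can be tried out directly. The sketch below is our own illustration for positive integers only (the names `factorize` and `hcf_lcm` are hypothetical, and trial division is used purely for simplicity); it takes the smaller exponent of each prime for the highest common factor and the larger for the lowest common multiple.

```python
from collections import Counter
from math import prod

def factorize(n):
    """Prime factorization of an integer n >= 1 as a Counter {prime: exponent}."""
    exps = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            exps[p] += 1
            n //= p
        p += 1
    if n > 1:
        exps[n] += 1
    return exps

def hcf_lcm(a, b):
    """Highest common factor and lowest common multiple of integers a, b >= 1."""
    ea, eb = factorize(a), factorize(b)
    primes = set(ea) | set(eb)
    # A Counter returns exponent 0 for a prime that does not occur.
    h = prod(p ** min(ea[p], eb[p]) for p in primes)
    l = prod(p ** max(ea[p], eb[p]) for p in primes)
    return h, l

print(hcf_lcm(360, 84))   # (12, 2520)
```

Here $360 = 2^3 \cdot 3^2 \cdot 5$ and $84 = 2^2 \cdot 3 \cdot 7$, so the minima give $2^2 \cdot 3 = 12$ and the maxima give $2^3 \cdot 3^2 \cdot 5 \cdot 7 = 2520$.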

Without uniqueness of factorization we can no longer guarantee the 
existence of highest common factors and lowest common multiples. (See 
Exercise 9 in this chapter.) The language of factorization of integers can 
only be carried over sensibly to a unique factorization domain. 

In the next section we shall see that if a domain has a property analogous 
to the division algorithm, then we can show that every irreducible is prime, 
so factorization is unique. In later chapters we shall develop more advanced 
techniques which will show unique factorization in a wider class of domains
which do not have this property.

4.6 Euclidean Domains


The crucial property in the usual proofs of unique factorization in $\mathbb{Z}$ or
$K[t]$ for a field $K$ is the existence of a division algorithm. A reasonable
generalization of this is the following:

Let $D$ be a domain. A Euclidean function for $D$ is a function
$\phi : D \setminus \{0\} \to \mathbb{N}$ such that

(i) If $a, b \in D \setminus \{0\}$ and $a \mid b$ then $\phi(a) \leq \phi(b)$,

(ii) If $a, b \in D \setminus \{0\}$ then there exist $q, r \in D$ such that $a = bq + r$ where
either $r = 0$ or $\phi(r) < \phi(b)$.

Thus for $\mathbb{Z}$ the function $\phi(n) = |n|$ and for $K[t]$, $\phi(p) = \partial p$ (the degree
of the polynomial $p$) are Euclidean functions.

If a domain has a Euclidean function we call it a Euclidean domain. We 
shall prove that a Euclidean domain has unique factorization by showing 
that every irreducible is prime. The route is this: first we show that in 
a Euclidean domain every ideal is principal (a domain with this property 
is called a principal ideal domain), then we show that the latter property 
implies all irreducibles are primes. 


Theorem 4.14. Every Euclidean domain is a principal ideal domain. 


Proof: Let $D$ be Euclidean, $I$ an ideal of $D$. If $I = 0$ it is principal, so
we may assume there exists a non-zero element $x$ of $I$. Further choose $x$
to make $\phi(x)$ as small as possible. If $y \in I$ then by (ii) we have $y = qx + r$
where either $r = 0$ or $\phi(r) < \phi(x)$. Now $r = y - qx \in I$ so we cannot have
$\phi(r) < \phi(x)$ because $\phi(x)$ is minimal. This means that $r = 0$, so $y$ is a multiple of
$x$. Therefore $I = (x)$ is principal. □


Theorem 4.15. Every principal ideal domain is a unique factorization 
domain. 


Proof: Let D be a principal ideal domain. Since this implies D is 
noetherian, factorization into irreducibles is possible by Theorem 4.6. To 
prove uniqueness we show that every irreducible is prime. 

Suppose $p$ is irreducible; then $(p)$ is maximal amongst the proper principal
ideals of $D$ by Proposition 4.4(d), but since every ideal is principal, this
means that $(p)$ is maximal amongst all proper ideals.

Suppose $p \mid ab$ but $p \nmid a$. The fact that $p \nmid a$ implies $(p, a) \supsetneq (p)$, so by
maximality, $(p, a) = D$.

Then $1 \in (p, a)$, so


$$1 = cp + da \quad (c, d \in D).$$

Multiplying by $b$ yields
$$b = cpb + dab$$


and since $p \mid ab$, we find $p \mid (cpb + dab)$, so $p \mid b$. This shows $p$ to be prime and
completes the proof. □
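
In $\mathbb{Z}$, which is a principal ideal domain, the elements $c$, $d$ with $1 = cp + da$ can be computed explicitly. The sketch below is an illustration under that assumption only; it uses the extended Euclidean algorithm (our choice of method, with the hypothetical name `bezout`) and then repeats the last step of the proof on a numerical example.

```python
def bezout(p, a):
    """Return (g, c, d) with g = gcd(p, a) = c*p + d*a, for positive integers p, a."""
    old_r, r = p, a
    old_c, c = 1, 0
    old_d, d = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_c, c = c, old_c - q * c
        old_d, d = d, old_d - q * d
    return old_r, old_c, old_d

p, a, b = 7, 10, 21                    # p divides ab = 210 but does not divide a
g, c, d = bezout(p, a)
assert g == 1 and c * p + d * a == 1   # 1 = cp + da
assert b == c * p * b + d * a * b      # multiply through by b
print((c, d), b % p == 0)              # (3, -2) True
```

Both terms on the right of $b = cpb + dab$ are divisible by $p$, which is why $p \mid b$.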


Theorem 4.16. A Euclidean domain is a unique factorization domain. □


4.7 Euclidean Quadratic Fields 


(This subsection may be omitted if desired.) 


In order to apply Theorem 4.16 it is necessary to exhibit some number
fields for which the ring of integers is Euclidean. We restrict ourselves to
the simplest case of quadratic fields $\mathbb{Q}(\sqrt{d})$ for squarefree $d$, beginning with
the easier situation when $d$ is negative.


Theorem 4.17. The ring of integers $\mathfrak{O}$ of $\mathbb{Q}(\sqrt{d})$ is Euclidean for
$d = -1, -2, -3, -7, -11$, with Euclidean function

$$\phi(\alpha) = |N(\alpha)|.$$


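Proof: To begin with we consider the suitability of the function $\phi$ defined
in the theorem. For this to be a Euclidean function, the following two
conditions must be satisfied for all $\alpha, \beta \in \mathfrak{O} \setminus 0$:

(a) If $\alpha \mid \beta$ then $|N(\alpha)| \le |N(\beta)|$.

(b) There exist $\gamma, \delta \in \mathfrak{O}$ such that $\alpha = \beta\gamma + \delta$ where either $\delta = 0$ or
$|N(\delta)| < |N(\beta)|$.

It is clear that (a) holds, for if $\alpha \mid \beta$ then $\beta = \lambda\alpha$ for $\lambda \in \mathfrak{O}$ and then

$$|N(\beta)| = |N(\lambda\alpha)| = |N(\alpha)N(\lambda)| = |N(\alpha)|\,|N(\lambda)|$$

with rational integer values for the various norms, and $|N(\lambda)| \ge 1$ because $\lambda \ne 0$.
To prove (b), we consider the alternative statement:

(c) For any $\varepsilon \in \mathbb{Q}(\sqrt{d})$ there exists $\kappa \in \mathfrak{O}$ such that

$$|N(\varepsilon - \kappa)| < 1.$$

We shall prove (c) is equivalent to (b). First suppose (b) holds. We know
(Lemma 2.11) that $c\varepsilon \in \mathfrak{O}$ for some non-zero $c \in \mathbb{Z}$. Applying (b) with $\alpha = c\varepsilon$,
$\beta = c$ we get two possibilities:

(i) $\delta = 0$ and $c\varepsilon = c\gamma$ for $\gamma \in \mathfrak{O}$. Then $\varepsilon = \gamma \in \mathfrak{O}$ and we may take
$\kappa = \varepsilon$.

(ii) $c\varepsilon = c\gamma + \delta$ where $|N(\delta)| < |N(c)|$. Now $c \ne 0$, so this implies

$$|N(\delta/c)| < 1,$$

which is the same as

$$|N(\varepsilon - \gamma)| < 1,$$

so we may take $\kappa = \gamma$. Hence (b) implies (c). To prove that (c) implies (b)
we put $\varepsilon = \alpha/\beta$ and argue similarly.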
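This allows us to concentrate on condition (c), which is relatively easy
to handle: in spirit it says that everything in $\mathbb{Q}(\sqrt{d})$ is 'near to' an integer.
Suppose $\varepsilon = r + s\sqrt{d}$ $(r, s \in \mathbb{Q})$. If $d \not\equiv 1 \pmod 4$ we have to find
$\kappa = x + y\sqrt{d}$ $(x, y \in \mathbb{Z})$ with

$$|(r - x)^2 - d(s - y)^2| < 1.$$

For $d = -1, -2$ we may do this by taking $x$ and $y$ to be the rational integers
nearest to $r$ and $s$ respectively, for then

$$|(r - x)^2 - d(s - y)^2| \le \left|\left(\tfrac{1}{2}\right)^2 + 2\left(\tfrac{1}{2}\right)^2\right| = \tfrac{3}{4} < 1.$$

The remaining three values of $d$ to be considered have $d \equiv 1 \pmod 4$. In
this case we must find

$$\kappa = x + y\left(\frac{1 + \sqrt{d}}{2}\right) \qquad (x, y \in \mathbb{Z})$$

such that

$$\left|\left(r - x - \tfrac{1}{2}y\right)^2 - d\left(s - \tfrac{1}{2}y\right)^2\right| < 1.$$

Certainly we can take $y$ to be the rational integer nearest to $2s$, so that
$|2s - y| \le \tfrac{1}{2}$; and then we may find $x \in \mathbb{Z}$ so that $|r - x - \tfrac{1}{2}y| \le \tfrac{1}{2}$. For
$d = -3, -7$, or $-11$ this means that

$$\left|\left(r - x - \tfrac{1}{2}y\right)^2 - d\left(s - \tfrac{1}{2}y\right)^2\right| \le \left|\tfrac{1}{4} + \tfrac{11}{16}\right| = \tfrac{15}{16} < 1.$$

The theorem is proved. □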
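The rounding recipe in the proof translates directly into a division-with-remainder routine. The sketch below is an illustration under stated assumptions, not the book's own code: an element $r + s\sqrt{d}$ is stored as a pair `(r, s)`, the function names are hypothetical, and exact `Fraction` arithmetic is used so that the inequality $|N(\delta)| < |N(\beta)|$ can be checked without rounding error.

```python
from fractions import Fraction


def norm(u, d):
    """Norm of r + s*sqrt(d), where u = (r, s)."""
    return u[0] * u[0] - d * u[1] * u[1]


def mul(u, v, d):
    """Product of two elements of Q(sqrt(d)) given as coordinate pairs."""
    return (u[0] * v[0] + d * u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def nearest_integer(eps, d):
    """Algebraic integer kappa close to eps, chosen as in the proof."""
    r, s = eps
    if d % 4 != 1:                      # integers are x + y*sqrt(d)
        return (Fraction(round(r)), Fraction(round(s)))
    y = round(2 * s)                    # integers are x + y*(1 + sqrt(d))/2
    x = round(r - Fraction(y, 2))
    return (x + Fraction(y, 2), Fraction(y, 2))


def euclidean_division(alpha, beta, d):
    """Return (gamma, delta) with alpha = beta*gamma + delta."""
    n = norm(beta, d)
    num = mul(alpha, (beta[0], -beta[1]), d)       # alpha * conjugate(beta)
    eps = (Fraction(num[0]) / n, Fraction(num[1]) / n)
    gamma = nearest_integer(eps, d)
    prod = mul(beta, gamma, d)
    delta = (alpha[0] - prod[0], alpha[1] - prod[1])
    return gamma, delta


# Example in Z[i]: 27 + 23i = (8 + i)(4 + 2i) + (-3 + 3i), and 18 < 65.
gamma, delta = euclidean_division((27, 23), (8, 1), -1)
print(gamma, delta, norm(delta, -1), norm((8, 1), -1))
```

Repeating the division on successive remainders gives a Euclidean algorithm in each of the five rings.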


To complete the picture for negative d we have: 


Theorem 4.18. For squarefree $d < -11$ the ring of integers of $\mathbb{Q}(\sqrt{d})$ is
not Euclidean.

Proof: Let $\mathfrak{O}$ be the ring of integers of $\mathbb{Q}(\sqrt{d})$ and suppose for a
contradiction that there exists a Euclidean function $\phi$. (We do not assume
$\phi = |N|$.) Choose $\alpha \in \mathfrak{O}$ such that $\alpha \ne 0$, $\alpha$ is not a unit, and $\phi(\alpha)$ is
minimal subject to this. Let $\beta$ be any element of $\mathfrak{O}$. Now there exist $\gamma, \delta$
such that $\beta = \alpha\gamma + \delta$ with $\delta = 0$ or $\phi(\delta) < \phi(\alpha)$. By choice of $\alpha$ the latter
condition implies that either $\delta = 0$ or $\delta$ is a unit.

For $d < -11$ it is easy to see that the only units of $\mathbb{Q}(\sqrt{d})$ are $\pm 1$
(Proposition 4.2). Hence for every $\beta \in \mathfrak{O}$ we have $\beta \equiv -1, 0$, or $1$ (mod
$(\alpha)$) and so $|\mathfrak{O}/(\alpha)| \le 3$.

Now we compute $|\mathfrak{O}/(\alpha)|$ using Theorem 1.17. By Theorem 2.16 $(\mathfrak{O}, +)$
is free abelian of rank 2. If $d \not\equiv 1 \pmod 4$ a $\mathbb{Z}$-basis for $(\alpha)$ is $\{\alpha, \alpha\sqrt{d}\}$
since a $\mathbb{Z}$-basis for $\mathfrak{O}$ is $\{1, \sqrt{d}\}$. If $\alpha = a + b\sqrt{d}$ $(a, b \in \mathbb{Z})$ the $\mathbb{Z}$-basis for
$(\alpha)$ is

$$\{a + b\sqrt{d},\; db + a\sqrt{d}\}.$$

Hence by Theorem 1.17 we have

$$|\mathfrak{O}/(\alpha)| = \left|\det\begin{pmatrix} a & b \\ db & a \end{pmatrix}\right| = |a^2 - db^2| = |N(\alpha)|.$$

Similar calculations apply for $d \equiv 1 \pmod 4$ with the same end result.
(These calculations are a special case of Corollary 5.10.) It follows that
$|N(\alpha)| \le 3$. Thus if $d \not\equiv 1 \pmod 4$ we have $|a^2 - db^2| \le 3$ with $a, b \in \mathbb{Z}$. If
$d \equiv 1 \pmod 4$ then $a = A/2$, $b = B/2$, for $A, B \in \mathbb{Z}$; and then $|A^2 - dB^2| \le 12$.
Since $d < -11$ the only solutions are $a = \pm 1$, $b = 0$; so $|N(\alpha)| = 1$ and
hence $\alpha$ is a unit. This contradicts the choice of $\alpha$. □
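The final counting step is easy to confirm by enumeration. The sketch below is an illustration only: the function name and the 'doubled coordinates' convention, in which a pair `(A, B)` stands for $(A + B\sqrt{d})/2$, are assumptions made for the example. It lists every non-zero algebraic integer of norm at most 3 in an imaginary quadratic field.

```python
from math import isqrt


def nonzero_integers_of_small_norm(d, bound=3):
    """Non-zero integers (A + B*sqrt(d))/2 of Q(sqrt(d)), d < 0 squarefree,
    with norm (A^2 - d*B^2)/4 <= bound, returned as pairs (A, B)."""
    limit = isqrt(4 * bound)
    found = []
    for A in range(-limit, limit + 1):
        for B in range(-limit, limit + 1):
            if (A, B) == (0, 0):
                continue
            if d % 4 == 1:                       # need A = B (mod 2)
                in_ring = (A - B) % 2 == 0
            else:                                # need a = A/2, b = B/2 integral
                in_ring = A % 2 == 0 and B % 2 == 0
            if in_ring and A * A - d * B * B <= 4 * bound:
                found.append((A, B))
    return found


# For d < -11 only (A, B) = (-2, 0) and (2, 0) survive, i.e. the units -1 and 1.
print(nonzero_integers_of_small_norm(-13))
print(nonzero_integers_of_small_norm(-15))
# For d = -11 there are non-units of norm 3, so the argument does not apply.
print(nonzero_integers_of_small_norm(-11))
```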


These two theorems together show that for negative $d$ the ring of integers
of $\mathbb{Q}(\sqrt{d})$ is Euclidean if and only if $d = -1, -2, -3, -7, -11$. Further,
when it is Euclidean it has as Euclidean function the absolute value of the
norm. For brevity call such fields norm-Euclidean.

The determination of the norm-Euclidean quadratic fields with d pos- 
itive has been a long process involving many mathematicians. Dickson 
proved $\mathbb{Q}(\sqrt{d})$ Euclidean for $d = 2, 3, 5, 13$ (mistakenly asserting there were
no others); Perron added 6, 7, 11, 17, 21, 29 to the list; Oppenheimer, Re- 
mak, and Rédei added 19, 33, 37, 41, 55, 73. Rédei claimed 97 as well 
but this was disproved by Barnes and Swinnerton-Dyer. Heilbronn proved 
the list finite in 1934, and the problem was finished off by Chatland and 
Davenport [14] in 1950 (and Inkeri [39] in 1949, independently) who proved: 


Theorem 4.19. The ring of integers of $\mathbb{Q}(\sqrt{d})$, for positive $d$, is
norm-Euclidean if and only if $d$ = 2, 3, 5, 6, 7, 11, 13, 17, 19, 21, 29, 33, 37,
41, 55, 73.

We cannot prove this theorem here. A good survey of the problem and
related questions, with references, is given by Narkiewicz [54].

Unlike the case $d$ negative it is not known whether $\mathbb{Q}(\sqrt{d})$ can be
Euclidean but not norm-Euclidean. In an interesting and readable paper Samuel
[65] suggests that $\mathbb{Q}(\sqrt{14})$ might have such properties.


4.8 Consequences of Unique Factorization 


When the integers in a number field have unique factorization, we can 
carry over many arguments of the type used in the factorization of integers 
(taking a little care at first). For example, if the reader refers back to the 
proof of the statement of Fermat in Section 4.1 it will become clear that, 
since $\mathbb{Z}[\sqrt{-2}]$ (the ring of integers in $\mathbb{Q}(\sqrt{-2})$) has unique factorization,
then the intuitive ‘proof’ given there is, in fact, valid. Let us consider 
another example of the same sort of thing, again a statement of Fermat, 
which we prove: 


Theorem 4.20. The equation

$$y^2 + 4 = z^3 \qquad (4.3)$$

has only the integer solutions $y = \pm 11$, $z = 5$, or $y = \pm 2$, $z = 2$.


Proof: First suppose $y$ odd, and work in the ring $\mathbb{Z}[i]$, which is a unique
factorization domain (Theorem 4.17). Then Equation (4.3) factorizes as

$$(2 + iy)(2 - iy) = z^3.$$

A common factor $a + ib$ of $2 + iy$, $2 - iy$ is also a factor of their sum, 4,
and difference, $2iy$, so taking norms

$$a^2 + b^2 \mid 16, \qquad a^2 + b^2 \mid 4y^2,$$

implying

$$a^2 + b^2 \mid 4.$$

The only solutions of this relation are $a = \pm 1, b = 0$, or $a = 0, b = \pm 1$,
or $a = \pm 1, b = \pm 1$, or $a = \pm 2, b = 0$, or $a = 0, b = \pm 2$, none of which
turn out to give a proper factor $a + ib$ of $2 + iy$ (the norm $4 + y^2$ of $2 + iy$ is
odd). Hence $2 + iy$, $2 - iy$ are coprime. By unique factorization in
$\mathbb{Z}[i]$ it follows that if their product is a cube then one is $\varepsilon\alpha^3$ and the other
$\varepsilon^{-1}\beta^3$ where $\varepsilon$ is a unit, and $\alpha, \beta \in \mathbb{Z}[i]$. But the units in $\mathbb{Z}[i]$ are $\pm i, \pm 1$
(Proposition 4.2), which are all cubes, so

$$2 + iy = (a + ib)^3$$

for some $a, b \in \mathbb{Z}$. Taking complex conjugates, we find that

$$2 - iy = (a - ib)^3.$$

Adding the two equations,

$$4 = 2a(a^2 - 3b^2)$$

so that

$$a(a^2 - 3b^2) = 2.$$

Now $a$ divides 2, so $a = \pm 1$ or $\pm 2$; and the choice of $a$ determines $b$. It is
easy to see that the only solutions are $a = -1, b = \pm 1$, or $a = 2, b = \pm 1$.
Then

$$z^3 = ((a + ib)(a - ib))^3 = (a^2 + b^2)^3,$$

so $z = a^2 + b^2 = 2, 5$ respectively. Then $y^2 + 4 = 8, 125$, so $y = \pm 2, \pm 11$.
This gives the solutions with $y = \pm 11$ as the only ones for $y$ odd.

Now suppose $y$ even, so that $y = 2Y$. Then $z$ is even as well, say
$z = 2Z$, and

$$Y^2 + 1 = 2Z^3.$$

Then $Y$ must be odd, say $Y = 2k + 1$. The highest common factor of $Y + i$
and $Y - i$ divides the difference $2i = (1 + i)^2$. Now $1 + i$ divides $Y + i$ and
$Y - i$ but $(1 + i)^2$ does not, so the highest common factor of $Y + i$ and
$Y - i$ is $1 + i$. Now

$$(1 + iY)(1 - iY) = 2Z^3$$

and the common factor $1 + i$ occurs twice on the left (bearing in mind that
$1 + iY = i(Y - i)$, $1 - iY = -i(Y + i)$). Hence there must be a factorization

$$1 + iY = (1 + i)(a + ib)^3$$

whence, comparing real parts as before,

$$1 = (a + b)(a^2 - 4ab + b^2)$$

so $a = 1, b = 0$, or $a = 0, b = 1$. These give $Y = \pm 1$, that is $y = \pm 2$, which
correspond to the other two solutions stated. □
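A direct search offers a sanity check on the theorem, though of course it proves nothing beyond the range searched. The sketch below is illustrative only; the function name and the search bound are arbitrary choices. It looks for integers $z$ for which $z^3 - 4$ is a perfect square.

```python
from math import isqrt


def solutions_y2_plus_4_eq_z3(z_max):
    """All (y, z) with y >= 0, 0 <= z <= z_max and y^2 + 4 = z^3."""
    found = []
    for z in range(z_max + 1):
        target = z ** 3 - 4
        if target < 0:
            continue
        y = isqrt(target)
        if y * y == target:
            found.append((y, z))
    return found


# Only y = 2, z = 2 and y = 11, z = 5 appear (together with y -> -y).
print(solutions_y2_plus_4_eq_z3(2000))   # [(2, 2), (11, 5)]
```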
4.9 The Ramanujan-Nagell Theorem


We now give a more intricate and impressive example of how unique
factorization properties of algebraic number fields are used to prove theorems on
Diophantine equations. Using the uniqueness of factorization in $\mathbb{Q}(\sqrt{-7})$
Nagell verified a conjecture of Ramanujan:


Theorem 4.21. The only solutions of the equation

$$x^2 + 7 = 2^n$$

in integers $x$, $n$ are:

    x    ±1    ±3    ±5    ±11    ±181
    n     3     4     5      7      15

Proof: We work in $\mathbb{Q}(\sqrt{-7})$, whose ring of integers has unique
factorization by Theorem 4.17. Clearly a solution for $x$ is odd and we will suppose
$x$ is positive.

Assume first that $n$ is even: then we have a factorization of integers:

$$(2^{n/2} + x)(2^{n/2} - x) = 7$$

so that

$$2^{n/2} + x = 7, \qquad 2^{n/2} - x = 1,$$

so

$$2^{1 + n/2} = 8$$

and $n = 4$, $x = 3$.

Now let $n$ be odd, and assume $n > 3$. We have the factorization into
primes:

$$2 = \left(\frac{1 + \sqrt{-7}}{2}\right)\left(\frac{1 - \sqrt{-7}}{2}\right).$$

Now $x$ is odd, $x = 2k + 1$, so $x^2 + 7 = 4k^2 + 4k + 8$ is divisible by 4. Putting
$m = n - 2$, we can rewrite the equation to be solved as

$$\frac{x^2 + 7}{4} = 2^m$$

so that

$$\left(\frac{x + \sqrt{-7}}{2}\right)\left(\frac{x - \sqrt{-7}}{2}\right) = \left(\frac{1 + \sqrt{-7}}{2}\right)^m \left(\frac{1 - \sqrt{-7}}{2}\right)^m$$

where the right-hand side is a prime factorization. Neither $(1 + \sqrt{-7})/2$
nor $(1 - \sqrt{-7})/2$ is a common factor of the terms on the left because such
a factor would divide their difference, $\sqrt{-7}$, which is seen to be impossible
by taking norms. Comparing the two factorizations, since the only units
in the integers of $\mathbb{Q}(\sqrt{-7})$ are $\pm 1$ (Proposition 4.2), we must have

$$\frac{x \pm \sqrt{-7}}{2} = \pm\left(\frac{1 \pm \sqrt{-7}}{2}\right)^m$$

from which we derive

$$\pm\sqrt{-7} = \left(\frac{1 + \sqrt{-7}}{2}\right)^m - \left(\frac{1 - \sqrt{-7}}{2}\right)^m.$$

We claim that the positive sign cannot occur. For, putting $(1 + \sqrt{-7})/2 = a$,
$(1 - \sqrt{-7})/2 = b$, we have

$$a^m - b^m = a - b.$$

Then

$$a^2 = (1 - b)^2 \equiv 1 \pmod{b^2}$$

since $ab = 2$, and so

$$a^m = a(a^2)^{(m-1)/2} \equiv a \pmod{b^2}$$

whence

$$a \equiv a - b \pmod{b^2},$$

a contradiction.

Hence the sign must be negative, and expanding by the binomial theorem
we have

$$-2^{m-1} = \binom{m}{1} - \binom{m}{3}7 + \binom{m}{5}7^2 - \cdots \pm \binom{m}{m}7^{(m-1)/2}$$

so that

$$-2^{m-1} \equiv m \pmod 7. \qquad (4.4)$$

Now $2^6 \equiv 1 \pmod 7$ and it follows easily that the only solutions of Equation
(4.4) are

$$m \equiv 3, 5, \text{ or } 13 \pmod{42}.$$


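We prove that only $m = 3, 5, 13$ can occur. It suffices to show that we
cannot have two solutions of the original equation that are congruent modulo
42. So let $m$, $m_1$ be two such solutions, and let $7^l$ be the largest power of
7 dividing $m - m_1$. Then

$$a^{m_1} = a^m \cdot a^{m_1 - m} = a^m \left(\tfrac{1}{2}\right)^{m_1 - m} (1 + \sqrt{-7})^{m_1 - m}. \qquad (4.5)$$

Now

$$\left(\tfrac{1}{2}\right)^{m_1 - m} = \left[\left(\tfrac{1}{2}\right)^6\right]^{(m_1 - m)/6} \equiv 1 \pmod{7^{l+1}},$$

and

$$(1 + \sqrt{-7})^{m_1 - m} \equiv 1 + (m_1 - m)\sqrt{-7} \pmod{7^{l+1}}$$

(first raise to powers $7, 7^2, \ldots, 7^l$, then $(m_1 - m)/7^l$). Since

$$a^m \equiv \frac{1 + m\sqrt{-7}}{2^m} \pmod 7$$

substituting in (4.5) gives

$$a^{m_1} \equiv a^m + \frac{m_1 - m}{2^m}\sqrt{-7} \pmod{7^{l+1}}$$

and

$$b^{m_1} \equiv b^m - \frac{m_1 - m}{2^m}\sqrt{-7} \pmod{7^{l+1}}.$$

But

$$a^m - b^m = a^{m_1} - b^{m_1}$$

so

$$(m - m_1)\sqrt{-7} \equiv 0 \pmod{7^{l+1}}.$$

Since $m$ and $m_1$ are rational integers,

$$m \equiv m_1 \pmod{7^{l+1}}$$

which contradicts the definition of $l$. □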
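Two steps of the proof are easy to check by machine: the residues allowed by congruence (4.4), and the list of solutions itself over a finite range. The sketch below is illustrative only; the function names and the bound on $n$ are arbitrary choices, and the search is a sanity check rather than a proof.

```python
from math import isqrt


def odd_residues_mod_42():
    """Odd m in [1, 42) with -2^(m-1) = m (mod 7), i.e. congruence (4.4)."""
    return [m for m in range(1, 43, 2) if (pow(2, m - 1, 7) + m) % 7 == 0]


def ramanujan_nagell(n_max):
    """All (x, n) with x >= 0, 0 <= n <= n_max and x^2 + 7 = 2^n."""
    found = []
    for n in range(n_max + 1):
        target = 2 ** n - 7
        if target < 0:
            continue
        x = isqrt(target)
        if x * x == target:
            found.append((x, n))
    return found


print(odd_residues_mod_42())    # [3, 5, 13]
print(ramanujan_nagell(200))    # [(1, 3), (3, 4), (5, 5), (11, 7), (181, 15)]
```

The values $m = 3, 5, 13$ correspond to $n = m + 2 = 5, 7, 15$; the solutions with $n = 3$ and $n = 4$ are the cases excluded or handled separately at the start of the proof.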
4.10 Exercises


1. Which of the following elements of $\mathbb{Z}[i]$ are irreducible ($i = \sqrt{-1}$):
   $1 + i$, $3 - 7i$, $5$, $7$, $12i$, $-4 + 5i$?

2. Write down the group of units of the ring of integers of: $\mathbb{Q}(\sqrt{-1})$,
   $\mathbb{Q}(\sqrt{-2})$, $\mathbb{Q}(\sqrt{-3})$, $\mathbb{Q}(\sqrt{-5})$, $\mathbb{Q}(\sqrt{-6})$.

3. Is the group of units of the integers in $\mathbb{Q}(\sqrt{3})$ finite?

4. Show that a homomorphic image of a noetherian ring is noetherian.

5. Find all ideals of $\mathbb{Z}$ which contain $(120)$.
   Show that every ascending chain of ideals of $\mathbb{Z}$ starting with $(120)$
   stops, by direct examination of the possibilities.

6. Find a ring that is not noetherian.

7. Check the calculations required to complete Theorems 4.10, 4.11.

8. Is $10 = (3 + i)(3 - i) = 2 \cdot 5$ an example of non-unique factorization
   in $\mathbb{Z}[i]$? Give reasons for your answer.

9. Show that 6 and $2(1 + \sqrt{-5})$ both have 2 and $1 + \sqrt{-5}$ as factors,
   but do not have a highest common factor in $\mathbb{Z}[\sqrt{-5}]$. Do they have
   a least common multiple? (Consider norms.)

10. Let $D$ be any integral domain. Suppose an element $x \in D$ has a
    factorization

    $$x = u p_1 \cdots p_n$$

    where $u$ is a unit and $p_1, \ldots, p_n$ are primes. Show that given any
    factorization

    $$x = v q_1 \cdots q_m$$

    where $v$ is a unit and $q_1, \ldots, q_m$ are irreducibles, then $m = n$ and
    there exists a permutation $\pi$ of $\{1, \ldots, n\}$ such that $p_i$, $q_{\pi(i)}$ are
    associates $(1 \le i \le n)$.

11. Show in $\mathbb{Z}[\sqrt{-5}]$ that $\sqrt{-5} \mid (a + b\sqrt{-5})$ if and only if $5 \mid a$. Deduce
    that $\sqrt{-5}$ is prime in $\mathbb{Z}[\sqrt{-5}]$. Hence conclude that the element 5
    factorizes uniquely into irreducibles in $\mathbb{Z}[\sqrt{-5}]$ although $\mathbb{Z}[\sqrt{-5}]$ does
    not have unique factorization.

12. Suppose $D$ is a unique factorization domain, and $a, b$ are coprime
    non-units. Deduce that if

    $$ab = c^n$$

    for $c \in D$, then there exists a unit $e \in D$ such that $ea$ and $e^{-1}b$ are
    $n$th powers in $D$.

13. Let $p$ be an odd rational prime and $\zeta = e^{2\pi i/p}$. If $\alpha$ is a prime element
    in $\mathbb{Z}[\zeta]$, prove that the rational integers which are divisible by $\alpha$ are
    precisely the rational integer multiples of some prime rational integer
    $q$. (Hint: $\alpha \mid N(\alpha)$, so $\alpha$ divides some rational prime factor $q$ of $N(\alpha)$.
    Now show $\alpha$ is not a factor of any $m \in \mathbb{Z}$ prime to $q$.)

14. Prove that the ring of integers of $\mathbb{Q}(e^{2\pi i/5})$ is Euclidean.

15. Prove that the ring of integers of $\mathbb{Q}(\sqrt{2}, i)$ is Euclidean.

16. Let $\mathbb{Q}_2$ be the set of all rational numbers $a/b$, where $a, b \in \mathbb{Z}$ and $b$
    is odd. Prove that $\mathbb{Q}_2$ is a domain, and that the only irreducibles in
    $\mathbb{Q}_2$ are 2 and its associates.

17. Generalize Exercise 16 to the ring $\mathbb{Q}_\pi$, where $\pi$ is a finite set of
    ordinary primes, this being defined as the set of all rationals $a/b$ with
    $b$ prime to the elements of $\pi$.

18. The following purports to be a proof that in any number field $K$ the
    ring of integers contains infinitely many irreducibles. Find the error.

    'Assume $\mathfrak{O}$ has only finitely many irreducibles $p_1, \ldots, p_n$. The
    number $1 + p_1 \cdots p_n$ must be divisible by some irreducible $q$, and this
    cannot be any of $p_1, \ldots, p_n$. This is a contradiction. Of course the
    argument breaks down unless we can find at least one irreducible in
    $\mathfrak{O}$; but since not every element of $\mathfrak{O}$ is a unit this is easy: let $x$ be
    any non-unit and let $p$ be some irreducible factor of $x$.'

    Hint: The 'proof' does not use any properties of $\mathfrak{O}$ beyond the
    existence of irreducible factorization and the fact that not every element
    is a unit. Now $\mathbb{Q}_2$ has these properties ...

19. Give a correct proof of the statement in Exercise 18.


5


Ideals


After the somewhat traumatic realization that factorization
into irreducibles is unique in some rings of integers but not in others, we
seek some way to minimize the damage. Kummer, and then Dedekind,
took steps to develop more insightful theories. Kummer had the bright
idea that if he could not factorize a number uniquely in a given ring of
integers, then perhaps he could extend the ring to a bigger one in which
further factorization might be not only possible, but unique. For example,
we pointed out in Chapter 4 that there are two factorizations
$6 = 2 \cdot 3 = (\sqrt{-6})(-\sqrt{-6})$ in $\mathbb{Z}[\sqrt{-6}]$, but $\sqrt{-6}$ does not divide 2 or 3 in this ring. In
fact, $2/\sqrt{-6} = \sqrt{-2/3}$, and $3/\sqrt{-6} = \sqrt{-3/2}$, neither of which belongs to
$\mathbb{Z}[\sqrt{-6}]$. Kummer's idea: throw them into the pot to create a larger ring.
He called the new elements introduced in this way 'ideal numbers'.
Dedekind looked at the same ideas from a different direction, introducing
the notion of an 'ideal' in ring theory, a term arising from the
correspondence with Kummer's ideal numbers. In simple ring-theoretic terms,
Dedekind showed that although unique factorization may fail for numbers,
an elegant theory of unique factorization can be developed for ideals. In this
theory, the essential building blocks are 'prime ideals', which are defined
by adapting the definition of a prime element from the previous chapter.
Just as it is often easier to work with both a ring of integers and its
corresponding field of quotients, we generalize the concept of ideal to 'fractional
ideal'; this generalization has the advantage that the non-zero fractional
ideals form a group under multiplication. From this the uniqueness of
factorization into prime ideals follows easily. Several standard consequences
of unique factorization are easily deduced. We define the norm of an ideal
as a generalization of the norm of an element and prove that the new norm
has the corresponding multiplicative property. We use this to show that
every ideal can be generated by at most two elements. This tightens the
noose around the neck of non-uniqueness of factorization. In the previous
chapter we saw that factorization of elements is unique in a principal ideal
domain (where every ideal is generated by a single element). We can now
refine this result to show that factorization of elements into irreducibles is
unique in a ring of integers if and only if every ideal is principal.


5.1 Historical Background 


To motivate Kummer’s introduction of ‘ideal numbers’ and Dedekind’s re- 
formulation of this concept in terms of ‘ideals’, we shall look more closely 
at some examples of the failure of unique factorization, in the hope that 
some pattern may emerge. 

Many of our previous examples exhibit no obvious pattern, but others 
do seem to have significant features. For instance, consider: 


$\mathbb{Q}(\sqrt{15})$:   $2 \cdot 5 = (5 + \sqrt{15})(5 - \sqrt{15})$

$\mathbb{Q}(\sqrt{30})$:   $2 \cdot 3 = (6 + \sqrt{30})(6 - \sqrt{30})$

$\mathbb{Q}(\sqrt{-10})$:   $2 \cdot 7 = (2 + \sqrt{-10})(2 - \sqrt{-10})$.

In these we see a curious phenomenon: there is a prime $p$ occurring on the
left, and on the right a factor $a + b\sqrt{d}$ where $a$ and $d$ are multiples of $p$. One
feels somehow that $\sqrt{p}$ is a common factor of both sides, but $\sqrt{p}$ does not
lie in the given number field. As a specific case, consider the first example
where $\sqrt{5}$ looks a likely candidate for a common factor, but $\sqrt{5}$ is not an
element of $\mathbb{Q}(\sqrt{15})$. Leaving aside the niceties at the moment, introduce
$\sqrt{5}$ into the factorization to get

$$5 + \sqrt{15} = \sqrt{5}(\sqrt{5} + \sqrt{3})$$
$$5 - \sqrt{15} = \sqrt{5}(\sqrt{5} - \sqrt{3}).$$

Multiplying up and cancelling the 5, we get

$$2 = (\sqrt{5} + \sqrt{3})(\sqrt{5} - \sqrt{3}).$$

We can now see that the two given factorizations of 10 are obtained by
grouping the factors in

$$(\sqrt{5})(\sqrt{5})(\sqrt{5} + \sqrt{3})(\sqrt{5} - \sqrt{3})$$

in two different ways.
Perhaps by introducing new numbers, such as $\sqrt{5}$, we can restore unique
factorization. Can our problem be that we are not factorizing in the right
context? In other words, if factorization of some element in the ring of
integers of the given number field $K$ is not unique, can we extend $K$ to a
field $L$ where it is? In our example to factorize the element 10 we extended
$\mathbb{Q}(\sqrt{15})$ to $\mathbb{Q}(\sqrt{3}, \sqrt{5})$. What about the others? The factorizations of 14
in $\mathbb{Q}(\sqrt{-10})$ can be found by extending to $\mathbb{Q}(\sqrt{2}, \sqrt{-5})$ to get two possible
groupings of the factors in

$$14 = (\sqrt{2})(\sqrt{2})(\sqrt{2} + \sqrt{-5})(\sqrt{2} - \sqrt{-5}).$$

The case of 6 in $\mathbb{Q}(\sqrt{30})$ is even more interesting: we have

$$6 = (\sqrt{2})(\sqrt{2})(\sqrt{3})(\sqrt{3})(\sqrt{6} + \sqrt{5})(\sqrt{6} - \sqrt{5}),$$

and the last two factors are units.

This is one way of viewing Kummer's theory. One starts with a number
field $K$ and extends to a field $L$. Then $\mathfrak{O}_K \subseteq \mathfrak{O}_L$. Neither of these rings
of integers need have unique factorization, but an element in $\mathfrak{O}_K$ may
factorize uniquely into elements in $\mathfrak{O}_L$.

At the outset Kummer did not describe the theory in this way. His
method involved detailed computations, which are described in Edwards
[21, 22]. This involved him introducing the notion of 'ideal' prime factors
for elements which may have no actual prime factors in $\mathfrak{O}_K$ at all. These
additional 'ideal' numbers may be interpreted as the elements introduced
from $\mathfrak{O}_L$ for factorization purposes.

A more appropriate formulation of the theory by Dedekind in terms 
of ideals (in the modern sense, though the origin of the name goes back 
to Kummer’s theory) has clarified matters. To motivate this approach, 
consider a factorization of an element 


$$x = ab$$

in a ring $R$. Recalling from Chapter 1 that the product of ideals $IJ$ is
just the set of finite sums $\sum x_i y_i$ $(x_i \in I, y_i \in J)$, we see that the ideal
generated by $x$ is the product of the ideals generated by $a$ and by $b$:

$$(x) = (a)(b).$$

More generally a product

$$x = p_1 \cdots p_n$$

of elements in $R$ corresponds to a product of principal ideals

$$(x) = (p_1) \cdots (p_n).$$

In considering uniqueness of factorization, the formulation in terms of ideals
is marginally better, for if we replace $p_1$ by $up_1$ where $u$ is a unit, we find
that the ideals $(p_1)$ and $(up_1)$ are the same (Proposition 4.4(b)). Thus if
the factors are unique up to multiplication by units and order, the ideals
$(p_1), \ldots, (p_n)$ are unique up to order. By passing to ideals we eliminate
the problems introduced by units.

How does this tie in with the earlier discussion? First consider the
example

$$10 = (\sqrt{5})(\sqrt{5})(\sqrt{5} + \sqrt{3})(\sqrt{5} - \sqrt{3})$$

in the integers of $\mathbb{Q}(\sqrt{3}, \sqrt{5})$. Let $K = \mathbb{Q}(\sqrt{15})$, $L = \mathbb{Q}(\sqrt{3}, \sqrt{5})$; then this
factorization holds in the ring of integers $\mathfrak{O}_L$. In this ring we also have the
corresponding factorization of principal ideals:

$$(10) = (\sqrt{5})(\sqrt{5})(\sqrt{5} + \sqrt{3})(\sqrt{5} - \sqrt{3}).$$

We may intersect the ideals in this factorization with $\mathfrak{O}_K$, and once more
we get ideals in $\mathfrak{O}_K$, but now these ideals may not be principal. For
instance, let $I = (\sqrt{5} + \sqrt{3}) \cap \mathfrak{O}_K$. Then $\sqrt{3}(\sqrt{5} + \sqrt{3}) = \sqrt{15} + 3 \in I$,
and $\sqrt{5}(\sqrt{5} + \sqrt{3}) = 5 + \sqrt{15} \in I$, so their difference

$$(5 + \sqrt{15}) - (3 + \sqrt{15}) = 2 \in I.$$

If $I$ were principal, say $I = (a + b\sqrt{15})$, then 2 would be a multiple of
$a + b\sqrt{15}$, and taking norms,

$$a^2 - 15b^2 \mid 4.$$

Suppose that $I$ is principal, say $I = (k)$. Now $N(5 + \sqrt{15}) = 10$ and
$N(3 + \sqrt{15}) = -6$, so $N(k) \mid 2$. We know that $N(k) \ne \pm 1$ since $I$ is proper.
If $N(k) = \pm 2$ then there exist $a, b \in \mathbb{Z}$ with $a^2 - 15b^2 = \pm 2$. But, taken
modulo 5, this leads to a contradiction. So $I$ is not principal.
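The contradiction modulo 5 comes from the fact that $a^2 - 15b^2 \equiv a^2 \pmod 5$, while $\pm 2$ are not squares modulo 5. A short check, offered only as an illustration (the function name is hypothetical):

```python
def residues_of_norm_form_mod_5():
    """All values of a^2 - 15*b^2 modulo 5 as a, b range over residues mod 5."""
    return {(a * a - 15 * b * b) % 5 for a in range(5) for b in range(5)}


values = residues_of_norm_form_mod_5()
print(sorted(values))                  # [0, 1, 4]
# Neither 2 nor -2 (which is 3 mod 5) is attained.
print(2 in values, (-2) % 5 in values)
```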

The moral is now clear. If we wish to factorize the principal ideal $(x)$
in a ring of integers $\mathfrak{O}_K$, then we may get a unique factorization of ideals,

$$(x) = I_1 \cdots I_r,$$

but the ideals $I_1, \ldots, I_r$ may not be principal.

Factorization into ideals proves to be most useful, however; for the ideals
in $\mathfrak{O}_K$ are not far off being principal, having (as we shall see) at most two
generators.
5.2 Prime Factorization of Ideals


Throughout this chapter $\mathfrak{O}$ is the ring of integers of a number field $K$ of
degree $n$. We use small Gothic letters to denote ideals (and later 'fractional
ideals') of $\mathfrak{O}$. We are interested in two special types of ideal, which we define
in a general situation as follows. Let $R$ be a ring. Then an ideal $\mathfrak{a}$ of $R$
is maximal if $\mathfrak{a}$ is a proper ideal of $R$ and there are no ideals of $R$ strictly
between $\mathfrak{a}$ and $R$. The ideal $\mathfrak{a} \ne R$ of $R$ is prime if, whenever $\mathfrak{b}$ and $\mathfrak{c}$ are
ideals of $R$ with $\mathfrak{bc} \subseteq \mathfrak{a}$, then either $\mathfrak{b} \subseteq \mathfrak{a}$ or $\mathfrak{c} \subseteq \mathfrak{a}$.

We can see where the latter definition comes from by considering the
special case where all three ideals concerned are principal, say $\mathfrak{a} = (a)$,
$\mathfrak{b} = (b)$, $\mathfrak{c} = (c)$. Since $x \mid y$ is equivalent to $(y) \subseteq (x)$ (Proposition 4.4(a)),
then the statement

    $\mathfrak{bc} \subseteq \mathfrak{a}$ implies either $\mathfrak{b} \subseteq \mathfrak{a}$ or $\mathfrak{c} \subseteq \mathfrak{a}$

translates into

    $a \mid bc$ implies either $a \mid b$ or $a \mid c$.


If R is an integral domain, then the zero ideal is prime, and here we find 
(p) is prime if and only if p is a prime or zero. (See Exercise 5 in this 
chapter.) The fact that we exclude 0 from the list of prime elements but 
include (0) as a prime ideal is a quirk of the historical development of the 
subject. Elements came first, and 0 was excluded from the list of primes of 
Z. On the other hand, the definition we have given for a prime ideal allows 
us to make the following simple characterizations: 


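Lemma 5.1. Let $R$ be a ring and $\mathfrak{a}$ an ideal of $R$. Then
(a) $\mathfrak{a}$ is maximal if and only if $R/\mathfrak{a}$ is a field;
(b) $\mathfrak{a}$ is prime if and only if $R/\mathfrak{a}$ is a domain.


Proof: The ideals of $R/\mathfrak{a}$ are in bijective correspondence with the ideals
of $R$ lying between $\mathfrak{a}$ and $R$. Hence $\mathfrak{a}$ is maximal if and only if $R/\mathfrak{a}$ has no
non-zero proper ideals. Now it is easy to show that a ring $S$ has no non-zero
proper ideals if and only if $S$ is a field. Taking $S = R/\mathfrak{a}$ proves (a).

To prove (b), first suppose $\mathfrak{a}$ is prime. If $x, y \in R$ are such that in $R/\mathfrak{a}$
we have

$$(\mathfrak{a} + x)(\mathfrak{a} + y) = 0$$

then $xy \in \mathfrak{a}$, so $(x)(y) \subseteq \mathfrak{a}$. Hence either $(x) \subseteq \mathfrak{a}$ or $(y) \subseteq \mathfrak{a}$, so either
$x \in \mathfrak{a}$ or $y \in \mathfrak{a}$. Hence one of $(\mathfrak{a} + x)$ or $(\mathfrak{a} + y)$ is zero in $R/\mathfrak{a}$, and therefore
$R/\mathfrak{a}$ has no zero-divisors so it is a domain. Conversely suppose $R/\mathfrak{a}$ is a
domain. Then $|R/\mathfrak{a}| \ne 1$ so $\mathfrak{a} \ne R$. Suppose if possible that $\mathfrak{bc} \subseteq \mathfrak{a}$ but
$\mathfrak{b} \not\subseteq \mathfrak{a}$, $\mathfrak{c} \not\subseteq \mathfrak{a}$. Then we can find elements $b \in \mathfrak{b}$, $c \in \mathfrak{c}$, with $b, c \notin \mathfrak{a}$ but
$bc \in \mathfrak{a}$. This means that $(\mathfrak{a} + b)$ and $(\mathfrak{a} + c)$ are zero-divisors in $R/\mathfrak{a}$, which
is a contradiction. □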


Corollary 5.2. Every maximal ideal is prime. □


Next we list some important properties of the ring of integers of a 
number field: 


Theorem 5.3. The ring of integers $\mathfrak{O}$ of a number field $K$ has the following
properties:
(a) It is a domain, with field of fractions $K$,
(b) It is noetherian,
(c) If $\alpha \in K$ satisfies a monic polynomial equation with coefficients in
$\mathfrak{O}$ then $\alpha \in \mathfrak{O}$,
(d) Every non-zero prime ideal of $\mathfrak{O}$ is maximal.


Proof: Part (a) is obvious. For part (b) note that by Theorem 2.16 the
group $(\mathfrak{O}, +)$ is free abelian of rank $n$. It follows by Theorem 1.16 that if
$\mathfrak{a}$ is an ideal of $\mathfrak{O}$ then $(\mathfrak{a}, +)$ is free abelian of rank $\le n$. Now any $\mathbb{Z}$-basis
for $(\mathfrak{a}, +)$ generates $\mathfrak{a}$ as an ideal, so every ideal of $\mathfrak{O}$ is finitely generated
and $\mathfrak{O}$ is noetherian. Part (c) is immediate from Theorem 2.10. To prove
part (d) let $\mathfrak{p}$ be a non-zero prime ideal of $\mathfrak{O}$. Let $0 \ne \alpha \in \mathfrak{p}$. Then

$$N = N(\alpha) = \alpha_1 \cdots \alpha_n \in \mathfrak{p}$$

(the $\alpha_i$ being the conjugates of $\alpha$) since $\alpha_1 = \alpha$. Therefore $(N) \subseteq \mathfrak{p}$, and
hence $\mathfrak{O}/\mathfrak{p}$ is a quotient ring of $\mathfrak{O}/N\mathfrak{O}$ which, being a finitely generated
abelian group with every element of finite order, is finite. Since $\mathfrak{O}/\mathfrak{p}$ is a
domain by Lemma 5.1(b) and is finite, it is a field by Theorem 1.5. Hence
$\mathfrak{p}$ is a maximal ideal by Lemma 5.1(a). □


The reader should note that part (d) of Theorem 5.3 is by no means
typical of general rings. For example if we take $R = \mathbb{R}[x, y]$, the ring
of polynomials in indeterminates $x, y$ with real coefficients, then the ideal
$(x)$ is prime but not maximal because $R/(x) \cong \mathbb{R}[y]$ is a domain which
is not a field. A ring which satisfies the conditions 5.3(a)-(d) is called a
Dedekind ring after the mathematician who made ring-theoretic advances
in this area. The proof of unique factorization of ideals which we shall give
shortly will hold good in all Dedekind rings, although in applications we
shall only require the special case when the ring is a ring $\mathfrak{O}$ of integers in
a number field.
To prove uniqueness we need to study the 'arithmetic' of non-zero ideals
of $\mathfrak{O}$, especially their behaviour under multiplication. Clearly this
multiplication is commutative and associative with $\mathfrak{O}$ itself as an identity. However,
inverses need not exist, so we do not have a group structure. It turns out
that we can capture a group if we spread our net wider. Note that an ideal
may be described as an $\mathfrak{O}$-submodule of $\mathfrak{O}$, so we look at $\mathfrak{O}$-submodules of
the field $K$. The particular submodules of interest to give the group
structure we desire will turn out to be characterized by the following property:
an $\mathfrak{O}$-submodule $\mathfrak{a}$ of $K$ is called a fractional ideal of $\mathfrak{O}$ if there exists some
non-zero $c \in \mathfrak{O}$ such that $c\mathfrak{a} \subseteq \mathfrak{O}$. In other words, the set $\mathfrak{b} = c\mathfrak{a}$ is an ideal
of $\mathfrak{O}$, and $\mathfrak{a} = c^{-1}\mathfrak{b}$; thus the fractional ideals of $\mathfrak{O}$ are subsets of $K$ of the
form $c^{-1}\mathfrak{b}$ where $\mathfrak{b}$ is an ideal of $\mathfrak{O}$ and $c$ is a non-zero element of $\mathfrak{O}$. (This
explains the name.)


Example 5.4. The fractional ideals of Z are of the form rZ where r € Q. 


Of course if every ideal of $\mathfrak{O}$ is principal, then the fractional ideals are
of the form $c^{-1}(d) = c^{-1}d\mathfrak{O}$ where $d$ is a generator. By Theorem 5.3(a)
this means the fractional ideals in a principal ideal domain $\mathfrak{O}$ are just $\alpha\mathfrak{O}$
where $\alpha \in K$. The interest in fractional ideals is greater because $\mathfrak{O}$ need
not be a principal ideal domain.

In general, an ideal is clearly a fractional ideal and, conversely, a
fractional ideal $\mathfrak{a}$ is an ideal if and only if $\mathfrak{a} \subseteq \mathfrak{O}$. The product of fractional
ideals is once more a fractional ideal. In fact, if $\mathfrak{a}_1 = c_1^{-1}\mathfrak{b}_1$, $\mathfrak{a}_2 = c_2^{-1}\mathfrak{b}_2$,
where $\mathfrak{b}_1, \mathfrak{b}_2$ are ideals and $c_1, c_2$ are non-zero elements of $\mathfrak{O}$, then
$\mathfrak{a}_1\mathfrak{a}_2 = (c_1c_2)^{-1}\mathfrak{b}_1\mathfrak{b}_2$. The multiplication of fractional ideals is
commutative and associative with $\mathfrak{O}$ acting as an identity.


Theorem 5.5. The non-zero fractional ideals of $\mathfrak{O}$ form an abelian group
under multiplication.


It is convenient to prove this result along with the main theorem of the
chapter:


Theorem 5.6. Every non-zero ideal of $\mathfrak{O}$ can be written as a product of
prime ideals, uniquely up to the order of the factors.


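Proof: We shall prove Theorems 5.5 and 5.6 together in a series of steps.

(i) Let $\mathfrak{a} \ne 0$ be an ideal of $\mathfrak{O}$. Then there exist prime ideals $\mathfrak{p}_1, \ldots, \mathfrak{p}_r$
such that $\mathfrak{p}_1 \cdots \mathfrak{p}_r \subseteq \mathfrak{a}$.

For a contradiction suppose not. Then since $\mathfrak{O}$ is noetherian (Theorem
5.3(b)) we may choose $\mathfrak{a}$ maximal, subject to the non-existence of such
$\mathfrak{p}$'s. Then $\mathfrak{a}$ is not prime (since we could then take $\mathfrak{p}_1 = \mathfrak{a}$), so there exist
ideals $\mathfrak{b}, \mathfrak{c}$ of $\mathfrak{O}$ with $\mathfrak{bc} \subseteq \mathfrak{a}$, $\mathfrak{b} \not\subseteq \mathfrak{a}$, $\mathfrak{c} \not\subseteq \mathfrak{a}$. Let

$$\mathfrak{a}_1 = \mathfrak{a} + \mathfrak{b}, \qquad \mathfrak{a}_2 = \mathfrak{a} + \mathfrak{c}.$$

Then $\mathfrak{a}_1\mathfrak{a}_2 \subseteq \mathfrak{a}$, $\mathfrak{a}_1 \supsetneq \mathfrak{a}$, $\mathfrak{a}_2 \supsetneq \mathfrak{a}$. By maximality of $\mathfrak{a}$ there exist prime ideals
$\mathfrak{p}_1, \ldots, \mathfrak{p}_s, \mathfrak{p}_{s+1}, \ldots, \mathfrak{p}_r$ such that

$$\mathfrak{p}_1 \cdots \mathfrak{p}_s \subseteq \mathfrak{a}_1, \qquad \mathfrak{p}_{s+1} \cdots \mathfrak{p}_r \subseteq \mathfrak{a}_2.$$

Hence

$$\mathfrak{p}_1 \cdots \mathfrak{p}_r \subseteq \mathfrak{a}_1\mathfrak{a}_2 \subseteq \mathfrak{a}$$

contrary to the choice of $\mathfrak{a}$.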


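(ii) Definition of what will turn out to be the inverse of an ideal:
For each ideal $\mathfrak{a}$ of $\mathfrak{O}$, define

$$\mathfrak{a}^{-1} = \{x \in K \mid x\mathfrak{a} \subseteq \mathfrak{O}\}.$$

It is clear that $\mathfrak{a}^{-1}$ is an $\mathfrak{O}$-submodule. If $\mathfrak{a} \ne 0$ then for any $c \in \mathfrak{a}$,
$c \ne 0$, we have $c\mathfrak{a}^{-1} \subseteq \mathfrak{O}$, so $\mathfrak{a}^{-1}$ is a fractional ideal. Clearly $\mathfrak{O} \subseteq \mathfrak{a}^{-1}$, so
$\mathfrak{a} = \mathfrak{a}\mathfrak{O} \subseteq \mathfrak{a}\mathfrak{a}^{-1}$. From the definition we have

$$\mathfrak{a}\mathfrak{a}^{-1} = \mathfrak{a}^{-1}\mathfrak{a} \subseteq \mathfrak{O}.$$

This means that the fractional ideal $\mathfrak{a}\mathfrak{a}^{-1}$ is actually an ideal. (Our aim
will be to prove $\mathfrak{a}\mathfrak{a}^{-1} = \mathfrak{O}$.) A further useful fact for ideals $\mathfrak{p}, \mathfrak{a}$ is that
$\mathfrak{a} \subseteq \mathfrak{p}$ implies $\mathfrak{O} \subseteq \mathfrak{p}^{-1} \subseteq \mathfrak{a}^{-1}$.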

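(iii) If $\mathfrak{a}$ is a proper ideal, then $\mathfrak{a}^{-1} \supsetneq \mathfrak{O}$.

Since $\mathfrak{a} \subseteq \mathfrak{p}$ for some maximal ideal $\mathfrak{p}$, whence $\mathfrak{p}^{-1} \subseteq \mathfrak{a}^{-1}$, it is sufficient
to prove $\mathfrak{p}^{-1} \ne \mathfrak{O}$ for $\mathfrak{p}$ maximal. We must therefore find a non-integer in
$\mathfrak{p}^{-1}$. We start with any $a \in \mathfrak{p}$, $a \ne 0$. Using (i) we choose the smallest $r$
such that

$$\mathfrak{p}_1 \cdots \mathfrak{p}_r \subseteq (a)$$

for $\mathfrak{p}_1, \ldots, \mathfrak{p}_r$ prime. Since $(a) \subseteq \mathfrak{p}$ and $\mathfrak{p}$ is prime (remember maximal
implies prime), some $\mathfrak{p}_i \subseteq \mathfrak{p}$. Without loss of generality $\mathfrak{p}_1 \subseteq \mathfrak{p}$. Hence
$\mathfrak{p}_1 = \mathfrak{p}$ since prime ideals in $\mathfrak{O}$ are maximal (Theorem 5.3(d)) and further

$$\mathfrak{p}_2 \cdots \mathfrak{p}_r \not\subseteq (a)$$

by minimality of $r$. Hence we can find $b \in \mathfrak{p}_2 \cdots \mathfrak{p}_r \setminus (a)$. But $b\mathfrak{p} \subseteq (a)$
so $ba^{-1}\mathfrak{p} \subseteq \mathfrak{O}$ and $ba^{-1} \in \mathfrak{p}^{-1}$. But $b \notin a\mathfrak{O}$ and so $ba^{-1} \notin \mathfrak{O}$, whence

$$\mathfrak{p}^{-1} \ne \mathfrak{O}.$$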
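(iv) If $\mathfrak{a}$ is a non-zero ideal and $\mathfrak{a}S \subseteq \mathfrak{a}$ for any subset $S \subseteq K$, then
$S \subseteq \mathfrak{O}$.

We must show that if $\mathfrak{a}\theta \subseteq \mathfrak{a}$ for $\theta \in S$, then $\theta \in \mathfrak{O}$. Because $\mathfrak{O}$ is
noetherian, $\mathfrak{a} = (a_1, \ldots, a_m)$, where not all the $a_i$ are zero. Then $\mathfrak{a}\theta \subseteq \mathfrak{a}$
implies

$$\begin{aligned}
a_1\theta &= b_{11}a_1 + \cdots + b_{1m}a_m \\
&\;\;\vdots \\
a_m\theta &= b_{m1}a_1 + \cdots + b_{mm}a_m
\end{aligned} \qquad (b_{ij} \in \mathfrak{O}).$$

As in Lemma 2.8 we deduce that because the equations

$$\begin{aligned}
(b_{11} - \theta)x_1 + \cdots + b_{1m}x_m &= 0 \\
&\;\;\vdots \\
b_{m1}x_1 + \cdots + (b_{mm} - \theta)x_m &= 0
\end{aligned}$$

have a non-zero solution $x_1 = a_1, \ldots, x_m = a_m$, then the determinant of
the array of coefficients is zero. This gives a monic polynomial equation in
$\theta$ with coefficients in $\mathfrak{O}$, hence $\theta \in \mathfrak{O}$ by Theorem 5.3(c). (We remark that
we could short-cut part of this proof by noting, as in the proof of Lemma
2.8, that the $b_{ij}$ may be taken to be rational integers which gives $\theta \in \mathfrak{O}$
directly.)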

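We are now in a position to take an important step in the proof of
Theorem 5.5:

(v) If $\mathfrak{p}$ is a maximal ideal, then $\mathfrak{p}\mathfrak{p}^{-1} = \mathfrak{O}$.

From (ii), $\mathfrak{p}\mathfrak{p}^{-1}$ is an ideal where $\mathfrak{p} \subseteq \mathfrak{p}\mathfrak{p}^{-1} \subseteq \mathfrak{O}$. Since $\mathfrak{p}$ is maximal,
$\mathfrak{p}\mathfrak{p}^{-1}$ is equal to $\mathfrak{p}$ or $\mathfrak{O}$. But if $\mathfrak{p}\mathfrak{p}^{-1} = \mathfrak{p}$, then (iv) would imply $\mathfrak{p}^{-1} \subseteq \mathfrak{O}$,
contradicting (iii). So $\mathfrak{p}\mathfrak{p}^{-1} = \mathfrak{O}$.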

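We can now extend (v) to any ideal $\mathfrak{a}$:

(vi) For every ideal $\mathfrak{a} \ne 0$, $\mathfrak{a}\mathfrak{a}^{-1} = \mathfrak{O}$.

If not, choose $\mathfrak{a}$ maximal subject to $\mathfrak{a}\mathfrak{a}^{-1} \ne \mathfrak{O}$. Then $\mathfrak{a} \subseteq \mathfrak{p}$ where $\mathfrak{p}$ is
maximal. From (ii), $\mathfrak{O} \subseteq \mathfrak{p}^{-1} \subseteq \mathfrak{a}^{-1}$, so

$$\mathfrak{a} \subseteq \mathfrak{a}\mathfrak{p}^{-1} \subseteq \mathfrak{a}\mathfrak{a}^{-1} \subseteq \mathfrak{O}.$$

In particular, $\mathfrak{a}\mathfrak{p}^{-1} \subseteq \mathfrak{O}$ implies $\mathfrak{a}\mathfrak{p}^{-1}$ is an ideal. Now we cannot have
$\mathfrak{a} = \mathfrak{a}\mathfrak{p}^{-1}$, for that would imply $\mathfrak{p}^{-1} \subseteq \mathfrak{O}$ by (iv), contradicting (iii) once
more. So $\mathfrak{a} \subsetneq \mathfrak{a}\mathfrak{p}^{-1}$ and the maximality condition on $\mathfrak{a}$ implies the ideal
$\mathfrak{a}\mathfrak{p}^{-1}$ satisfies

$$\mathfrak{a}\mathfrak{p}^{-1}(\mathfrak{a}\mathfrak{p}^{-1})^{-1} = \mathfrak{O}.$$

By the definition of $\mathfrak{a}^{-1}$ this means

$$\mathfrak{p}^{-1}(\mathfrak{a}\mathfrak{p}^{-1})^{-1} \subseteq \mathfrak{a}^{-1}.$$

Thus

$$\mathfrak{O} = \mathfrak{a}\mathfrak{p}^{-1}(\mathfrak{a}\mathfrak{p}^{-1})^{-1} \subseteq \mathfrak{a}\mathfrak{a}^{-1} \subseteq \mathfrak{O}$$

from which the result follows.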


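(vii) Every fractional ideal $\mathfrak{a}$ has an inverse $\mathfrak{a}^{-1}$ such that $\mathfrak{a}\mathfrak{a}^{-1} = \mathfrak{O}$.

The set $\mathcal{F}$ of fractional ideals is already known to be a commutative
semigroup, so given a fractional ideal $\mathfrak{a}$, we only need to find another
fractional ideal $\mathfrak{a}'$ such that $\mathfrak{a}\mathfrak{a}' = \mathfrak{O}$; then $\mathfrak{a}'$ will be the required inverse. But
there exists an ideal $\mathfrak{b}$ and a non-zero element $c \in \mathfrak{O}$ such that $\mathfrak{a} = c^{-1}\mathfrak{b}$.
Let $\mathfrak{a}' = c\mathfrak{b}^{-1}$; then $\mathfrak{a}\mathfrak{a}' = \mathfrak{O}$ as required.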

This, of course, proves Theorem 5.5. 

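(viii) Every non-zero ideal $\mathfrak{a}$ is a product of prime ideals.

If not, let $\mathfrak{a}$ be maximal subject to the condition of not being a product
of prime ideals. Then $\mathfrak{a}$ is not prime, but we will have $\mathfrak{a} \subseteq \mathfrak{p}$ for some
maximal (hence prime) ideal $\mathfrak{p}$, and as in (vi),

$$\mathfrak{a} \subsetneq \mathfrak{a}\mathfrak{p}^{-1} \subseteq \mathfrak{O}.$$

By the maximality condition on $\mathfrak{a}$,

$$\mathfrak{a}\mathfrak{p}^{-1} = \mathfrak{p}_2 \cdots \mathfrak{p}_r$$

for prime ideals $\mathfrak{p}_2, \ldots, \mathfrak{p}_r$, whence

$$\mathfrak{a} = \mathfrak{p}\mathfrak{p}_2 \cdots \mathfrak{p}_r.$$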


(ix) Prime factorization is unique. 

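By analogy with factorization of elements, for ideals $\mathfrak{a}$, $\mathfrak{b}$ we shall say
that $\mathfrak{a}$ divides $\mathfrak{b}$ (written $\mathfrak{a} \mid \mathfrak{b}$) if there is an ideal $\mathfrak{c}$ such that $\mathfrak{b} = \mathfrak{a}\mathfrak{c}$. This
condition is equivalent to $\mathfrak{a} \supseteq \mathfrak{b}$ since we may then take $\mathfrak{c} = \mathfrak{a}^{-1}\mathfrak{b}$. The
definition of prime ideal $\mathfrak{p}$ shows that if $\mathfrak{p} \mid \mathfrak{a}\mathfrak{b}$ then either $\mathfrak{p} \mid \mathfrak{a}$ or $\mathfrak{p} \mid \mathfrak{b}$. If we
now have prime ideals $\mathfrak{p}_1, \ldots, \mathfrak{p}_r$, $\mathfrak{q}_1, \ldots, \mathfrak{q}_s$ with

$$\mathfrak{p}_1 \cdots \mathfrak{p}_r = \mathfrak{q}_1 \cdots \mathfrak{q}_s,$$

then $\mathfrak{p}_1$ divides some $\mathfrak{q}_i$, so by maximality $\mathfrak{p}_1 = \mathfrak{q}_i$. Multiplying by $\mathfrak{p}_1^{-1}$
and using induction we obtain uniqueness of prime factorization up to the
order of the factors.

This proves Theorem 5.6. □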


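In fact, the fractional ideals also factorize uniquely if we allow negative
powers of prime ideals. Namely, if $\mathfrak{a}$ is a fractional ideal with $0 \neq c \in \mathfrak{O}$
such that $c\mathfrak{a}$ is an ideal, we have

$$\langle c \rangle = \mathfrak{p}_1 \cdots \mathfrak{p}_r, \qquad c\mathfrak{a} = \mathfrak{q}_1 \cdots \mathfrak{q}_s,$$

and hence

$$\mathfrak{a} = \mathfrak{p}_1^{-1} \cdots \mathfrak{p}_r^{-1}\mathfrak{q}_1 \cdots \mathfrak{q}_s.$$

One result in the proofs of Theorems 5.5 and 5.6 which is worth isolating
occurs in step (ix):


Proposition 5.7. For ideals $\mathfrak{a}$, $\mathfrak{b}$ of $\mathfrak{O}$,
$\mathfrak{a} \mid \mathfrak{b}$ if and only if $\mathfrak{a} \supseteq \mathfrak{b}$. □

This tells us that in $\mathfrak{O}$ the factors of an ideal $\mathfrak{b}$ are precisely the ideals
containing $\mathfrak{b}$. The definition of a prime ideal $\mathfrak{p}$ also translates into a notation
directly analogous to that of a prime element:

$$\mathfrak{p} \mid \mathfrak{a}\mathfrak{b} \text{ implies } \mathfrak{p} \mid \mathfrak{a} \text{ or } \mathfrak{p} \mid \mathfrak{b}.$$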


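An extended worked example. The factorization of the ideal $\langle 18 \rangle$ in
$\mathbb{Z}[\sqrt{-17}]$.


From Theorem 4.10 we have the factorization of elements:

$$18 = 2 \cdot 3 \cdot 3 = (1 + \sqrt{-17})(1 - \sqrt{-17}).$$

Consider the ideal $\mathfrak{p}_1 = \langle 2, 1 + \sqrt{-17} \rangle$ whose generators are both factors
of 18. Clearly $18 \in \mathfrak{p}_1$ so $\langle 18 \rangle \subseteq \mathfrak{p}_1$, which means that $\mathfrak{p}_1$ is a factor of
$\langle 18 \rangle$. In fact we also have

$$1 - \sqrt{-17} = 2 - (1 + \sqrt{-17}) \in \mathfrak{p}_1$$

so

$$18 = (1 + \sqrt{-17})(1 - \sqrt{-17}) \in \mathfrak{p}_1^2$$

which means that $\langle 18 \rangle \subseteq \mathfrak{p}_1^2$ and $\mathfrak{p}_1^2$ is a factor of $\langle 18 \rangle$. Now the elements
of $\mathfrak{p}_1$ are of the form

$$\begin{aligned}
2(a + b\sqrt{-17}) + (1 + \sqrt{-17})(c + d\sqrt{-17})
&= (2a + c - 17d) + (2b + c + d)\sqrt{-17} \\
&= r + s\sqrt{-17}
\end{aligned}$$

where

$$r - s = 2a - 2b - 18d,$$

which is always even. Clearly $r$ may be taken to be any integer and then
$s$ may be any integer of the same parity (odd or even). This implies $\mathfrak{p}_1$
is not the whole ring $\mathbb{Z}[\sqrt{-17}]$. On the other hand, $\mathfrak{p}_1$ is maximal, for if
$m + n\sqrt{-17}$ is any element not in $\mathfrak{p}_1$ then one of $m$, $n$ is even and the other
odd, so

$$\langle \mathfrak{p}_1, m + n\sqrt{-17} \rangle = \mathbb{Z}[\sqrt{-17}].$$

Similarly, considering

$$\mathfrak{p}_2 = \langle 3, 1 + \sqrt{-17} \rangle,$$

we find that an element of $\mathfrak{p}_2$ is of the form

$$r + s\sqrt{-17} = (3a + c - 17d) + (3b + c + d)\sqrt{-17}$$

where $r - s = 3(a - b - 6d)$. Thus $r$, $s$ can be any integers subject to the
constraint

$$r \equiv s \pmod{3}.$$

Once more we find $\mathfrak{p}_2$ maximal and $18 = 2 \cdot 3 \cdot 3 \in \mathfrak{p}_2^2$, so $\mathfrak{p}_2^2$ is a factor
of $\langle 18 \rangle$.
Finally, considering

$$\mathfrak{p}_3 = \langle 3, 1 - \sqrt{-17} \rangle,$$

we get another prime ideal such that $\mathfrak{p}_3^2$ is a factor of $\langle 18 \rangle$, and a calculation
similar to the previous ones shows that $r + s\sqrt{-17} \in \mathfrak{p}_3$ if and only if

$$r + s \equiv 0 \pmod{3}.$$

Using the factorization theory of Theorem 5.6, we find that

$$\mathfrak{p}_1^2\mathfrak{p}_2^2\mathfrak{p}_3^2 \supseteq \langle 18 \rangle.$$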


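The final step, to show that $\langle 18 \rangle = \mathfrak{p}_1^2\mathfrak{p}_2^2\mathfrak{p}_3^2$, is best performed using a
counting argument. Since every element in $\mathbb{Z}[\sqrt{-17}]$ is either in $\mathfrak{p}_1$ or of
the form $1 + x$ for $x \in \mathfrak{p}_1$, the number of elements in the quotient ring
$\mathbb{Z}[\sqrt{-17}]/\mathfrak{p}_1$ is

$$|\mathbb{Z}[\sqrt{-17}]/\mathfrak{p}_1| = 2.$$

Similarly

$$|\mathbb{Z}[\sqrt{-17}]/\mathfrak{p}_r| = 3 \qquad (r = 2, 3).$$

In the next section we shall call

$$|\mathfrak{O}/\mathfrak{p}|$$

the norm of the ideal $\mathfrak{p}$ and write it as $N(\mathfrak{p})$. The crucial property of this
new type of norm is that it is multiplicative,

$$N(\mathfrak{a}\mathfrak{b}) = N(\mathfrak{a})N(\mathfrak{b}).$$

Granted this fact, we can deduce

$$N(\mathfrak{p}_1^2\mathfrak{p}_2^2\mathfrak{p}_3^2) = 2^2 \cdot 3^2 \cdot 3^2 = 18^2.$$

Now the norm of the ideal $\langle 18 \rangle$ is

$$N(\langle 18 \rangle) = |\mathbb{Z}[\sqrt{-17}]/\langle 18 \rangle|$$

and since every element of $\mathbb{Z}[\sqrt{-17}]$ is uniquely of the form

$$a + b\sqrt{-17} + x$$

where $a$, $b$ are integers in the range 0 to 17 and $x \in \langle 18 \rangle$, we find 18 choices
each for $a$, $b$ so

$$N(\langle 18 \rangle) = 18^2.$$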
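The coset counts above can be cross-checked mechanically. The following is a minimal sketch, not part of the original argument: it assumes we store $x + y\sqrt{-17}$ as the integer pair `(x, y)` and uses the standard fact that the index of a full-rank subgroup of $\mathbb{Z}^2$ equals the greatest common divisor of the $2 \times 2$ minors of any generating set. The names `times_sqrt` and `ideal_index` are illustrative.

```python
from math import gcd
from itertools import combinations

D = -17  # we work in Z[sqrt(D)]; the pair (x, y) stands for x + y*sqrt(D)

def times_sqrt(elem):
    """Multiply x + y*sqrt(D) by sqrt(D)."""
    x, y = elem
    return (D * y, x)

def ideal_index(generators):
    """Index of the ideal <generators> in Z[sqrt(D)] as an additive group.

    As a Z-module the ideal is spanned by g and g*sqrt(D) for each ring
    generator g; the index of a full-rank sublattice of Z^2 is the gcd of
    the 2x2 minors of these spanning vectors.
    """
    rows = []
    for g in generators:
        rows.append(g)
        rows.append(times_sqrt(g))
    index = 0
    for (a, b), (c, d) in combinations(rows, 2):
        index = gcd(index, abs(a * d - b * c))
    return index

p1 = [(2, 0), (1, 1)]    # <2, 1 + sqrt(-17)>
p2 = [(3, 0), (1, 1)]    # <3, 1 + sqrt(-17)>
p3 = [(3, 0), (1, -1)]   # <3, 1 - sqrt(-17)>

print(ideal_index(p1), ideal_index(p2), ideal_index(p3))  # 2 3 3
print(ideal_index([(18, 0)]))                             # 324
```

The value 324 is $18^2$, in agreement with the count of 18 choices each for $a$ and $b$.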
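Suppose $\langle 18 \rangle$ factorizes as

$$\langle 18 \rangle = \mathfrak{p}_1^2\mathfrak{p}_2^2\mathfrak{p}_3^2\mathfrak{a}$$

for some ideal $\mathfrak{a}$. Then taking norms and using the multiplicative property,
we find $N(\mathfrak{a}) = 1$, whence $\mathfrak{a}$ is the whole ring and

$$\langle 18 \rangle = \mathfrak{p}_1^2\mathfrak{p}_2^2\mathfrak{p}_3^2. \tag{5.1}$$

If we consider the factorization of elements $18 = 2 \cdot 3 \cdot 3$, we obtain

$$\langle 2 \rangle \langle 3 \rangle^2 = \mathfrak{p}_1^2\mathfrak{p}_2^2\mathfrak{p}_3^2. \tag{5.2}$$

By the uniqueness of factorization of ideals, both $\langle 2 \rangle$, $\langle 3 \rangle$ are products of
prime ideals from the set $\{\mathfrak{p}_1, \mathfrak{p}_2, \mathfrak{p}_3\}$. Now $2 \in \mathfrak{p}_1$ but $2 \notin \mathfrak{p}_2$, $2 \notin \mathfrak{p}_3$, so
$\mathfrak{p}_1 \mid \langle 2 \rangle$, $\mathfrak{p}_2 \nmid \langle 2 \rangle$, $\mathfrak{p}_3 \nmid \langle 2 \rangle$, thus

$$\langle 2 \rangle = \mathfrak{p}_1^q.$$

Similarly $3 \notin \mathfrak{p}_1$, $3 \in \mathfrak{p}_2$, $3 \in \mathfrak{p}_3$ implies

$$\langle 3 \rangle = \mathfrak{p}_2^r\mathfrak{p}_3^s.$$

Substituting in Equation (5.2) gives

$$\mathfrak{p}_1^q\mathfrak{p}_2^{2r}\mathfrak{p}_3^{2s} = \mathfrak{p}_1^2\mathfrak{p}_2^2\mathfrak{p}_3^2,$$

and uniqueness of factorization of ideals implies

$$q = 2, \qquad r = s = 1,$$

$$\langle 2 \rangle = \mathfrak{p}_1^2, \qquad \langle 3 \rangle = \mathfrak{p}_2\mathfrak{p}_3. \tag{5.3}$$

(The reader may find it instructive to check these by direct calculation.)
A similar argument using

$$\langle 18 \rangle = \langle 1 + \sqrt{-17} \rangle \langle 1 - \sqrt{-17} \rangle = \mathfrak{p}_1^2\mathfrak{p}_2^2\mathfrak{p}_3^2 \tag{5.4}$$

where $1 + \sqrt{-17} \in \mathfrak{p}_1, \mathfrak{p}_2$; $1 + \sqrt{-17} \notin \mathfrak{p}_3$; $1 - \sqrt{-17} \in \mathfrak{p}_1, \mathfrak{p}_3$; $1 - \sqrt{-17} \notin \mathfrak{p}_2$
gives

$$\langle 1 + \sqrt{-17} \rangle = \mathfrak{p}_1^m\mathfrak{p}_2^n, \qquad \langle 1 - \sqrt{-17} \rangle = \mathfrak{p}_1^r\mathfrak{p}_3^s.$$

Substituting in Equation (5.4) implies $m = r = 1$, $n = s = 2$, so

$$\langle 1 + \sqrt{-17} \rangle = \mathfrak{p}_1\mathfrak{p}_2^2, \qquad \langle 1 - \sqrt{-17} \rangle = \mathfrak{p}_1\mathfrak{p}_3^2. \tag{5.5}$$

From Equations (5.3) and (5.5) we see that the two alternative
factorizations of the element 18 come from alternative groupings of the ideals:

$$\begin{aligned}
\langle 18 \rangle &= (\mathfrak{p}_1^2)(\mathfrak{p}_2\mathfrak{p}_3)^2 = \langle 2 \rangle \langle 3 \rangle^2 \\
&= (\mathfrak{p}_1\mathfrak{p}_2^2)(\mathfrak{p}_1\mathfrak{p}_3^2) = \langle 1 + \sqrt{-17} \rangle \langle 1 - \sqrt{-17} \rangle.
\end{aligned}$$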
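For readers who wish to carry out the "direct calculation" by machine, here is a hedged sketch. It is our own illustration, not the book's method: each ideal is reduced to a canonical $\mathbb{Z}$-basis $\{(a, 0), (b, c)\}$ (a two-dimensional Hermite normal form), so that two ideals are equal exactly when their canonical bases agree. The helper names `mul`, `ideal_product` and `canonical_basis` are assumptions made for the example.

```python
from math import gcd

D = -17  # Z[sqrt(D)]; the pair (x, y) stands for x + y*sqrt(D)

def mul(u, v):
    """Product of two elements of Z[sqrt(D)]."""
    return (u[0] * v[0] + D * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def ideal_product(*ideals):
    """Ring generators of a product of ideals: all products of generators."""
    gens = [(1, 0)]
    for ideal in ideals:
        gens = [mul(g, h) for g in gens for h in ideal]
    return gens

def canonical_basis(gens):
    """Return (a, b, c) with Z-basis {(a, 0), (b, c)}, a, c > 0, 0 <= b < a.

    Assumes the ideal is non-zero, so the lattice has full rank.
    """
    top, a = (0, 0), 0
    for g in gens:
        for v in (g, mul(g, (0, 1))):          # g and g*sqrt(D)
            while v[1] != 0:                    # Euclid on second coordinates
                q = top[1] // v[1]
                top, v = v, (top[0] - q * v[0], top[1] - q * v[1])
            a = gcd(a, abs(v[0]))               # v now lies on the x-axis
    if top[1] < 0:
        top = (-top[0], -top[1])
    return (a, top[0] % a, top[1])

def same_ideal(i, j):
    return canonical_basis(i) == canonical_basis(j)

p1 = [(2, 0), (1, 1)]
p2 = [(3, 0), (1, 1)]
p3 = [(3, 0), (1, -1)]

print(same_ideal(ideal_product(p1, p1), [(2, 0)]))       # Equation (5.3)
print(same_ideal(ideal_product(p2, p3), [(3, 0)]))       # Equation (5.3)
print(same_ideal(ideal_product(p1, p2, p2), [(1, 1)]))   # Equation (5.5)
print(same_ideal(ideal_product(p1, p3, p3), [(1, -1)]))  # Equation (5.5)
```

All four comparisons should print `True`, matching Equations (5.3) and (5.5).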


We shall consider the norm of an ideal and its multiplicative property 
in the next section, once we have dealt with some simple consequences of 
unique factorization. Later on we shall develop certain other properties of 
the norm which will help streamline the calculations in the above example. 




5.3 The Norm of an Ideal 


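Once unique factorization is proved, several useful consequences follow in
the usual way. In particular, any two non-zero ideals $\mathfrak{a}$ and $\mathfrak{b}$ have a
greatest common divisor $\mathfrak{g}$ and a least common multiple $\mathfrak{l}$ with the following
properties:

$\mathfrak{g} \mid \mathfrak{a}$, $\mathfrak{g} \mid \mathfrak{b}$; and if $\mathfrak{g}'$ has the same properties then $\mathfrak{g}' \mid \mathfrak{g}$;
$\mathfrak{a} \mid \mathfrak{l}$, $\mathfrak{b} \mid \mathfrak{l}$; and if $\mathfrak{l}'$ has the same properties then $\mathfrak{l} \mid \mathfrak{l}'$.

In fact, suppose we factorize $\mathfrak{a}$ and $\mathfrak{b}$ into primes as:

$$\mathfrak{a} = \prod \mathfrak{p}_i^{e_i}, \qquad \mathfrak{b} = \prod \mathfrak{p}_i^{f_i}$$

with distinct prime ideals $\mathfrak{p}_i$. Then we clearly have

$$\mathfrak{g} = \prod \mathfrak{p}_i^{\min(e_i, f_i)},$$

$$\mathfrak{l} = \prod \mathfrak{p}_i^{\max(e_i, f_i)}.$$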
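The min/max rule is easy to mechanize once factorizations are known. The sketch below is only an illustration: it represents an ideal by a dictionary from (hypothetical) prime-ideal labels to exponents, and applies the rule to $\langle 2 \rangle = \mathfrak{p}_1^2$ and $\langle 1 + \sqrt{-17} \rangle = \mathfrak{p}_1\mathfrak{p}_2^2$ from the worked example in $\mathbb{Z}[\sqrt{-17}]$.

```python
def ideal_gcd(a, b):
    """Greatest common divisor: minimum exponent of each prime ideal."""
    out = {p: min(a.get(p, 0), b.get(p, 0)) for p in sorted(set(a) | set(b))}
    return {p: e for p, e in out.items() if e > 0}   # drop exponent 0

def ideal_lcm(a, b):
    """Least common multiple: maximum exponent of each prime ideal."""
    return {p: max(a.get(p, 0), b.get(p, 0)) for p in sorted(set(a) | set(b))}

two = {"p1": 2}               # <2> = p1^2
elem = {"p1": 1, "p2": 2}     # <1 + sqrt(-17)> = p1 * p2^2

print(ideal_gcd(two, elem))   # {'p1': 1}
print(ideal_lcm(two, elem))   # {'p1': 2, 'p2': 2}
```

The greatest common divisor here is $\mathfrak{p}_1 = \langle 2, 1 + \sqrt{-17} \rangle$, the ideal generated by the two elements together.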


We have useful alternative expressions: 


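Lemma 5.8. If $\mathfrak{a}$ and $\mathfrak{b}$ are ideals of $\mathfrak{O}$, and $\mathfrak{g}$, $\mathfrak{l}$ are the greatest common
divisor and least common multiple, respectively, of $\mathfrak{a}$ and $\mathfrak{b}$, then

$$\mathfrak{g} = \mathfrak{a} + \mathfrak{b}, \qquad \mathfrak{l} = \mathfrak{a} \cap \mathfrak{b}.$$


Proof: We know that $\mathfrak{r} \mid \mathfrak{a}$ if and only if $\mathfrak{r} \supseteq \mathfrak{a}$ (Proposition 5.7). Hence
$\mathfrak{g}$ must be the smallest ideal containing $\mathfrak{a}$ and $\mathfrak{b}$, and $\mathfrak{l}$ the largest ideal
contained in $\mathfrak{a}$ and $\mathfrak{b}$. The rest is obvious. □


The proof of Theorem 5.3 shows that if $\mathfrak{a}$ is a non-zero ideal of $\mathfrak{O}$ then
the quotient ring $\mathfrak{O}/\mathfrak{a}$ is finite. We define the norm of $\mathfrak{a}$ to be

$$N(\mathfrak{a}) = |\mathfrak{O}/\mathfrak{a}|.$$

Then $N(\mathfrak{a})$ is a positive integer. There is no reason to confuse this norm
with the old norm of an element $N(a)$ since it applies only to ideals. In fact
there is a connection between the two norms, as we shall see in a moment.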


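Theorem 5.9.

(a) Every ideal $\mathfrak{a}$ of $\mathfrak{O}$ with $\mathfrak{a} \neq 0$ has a $\mathbb{Z}$-basis $\{\alpha_1, \ldots, \alpha_n\}$ where $n$
is the degree of $K$;

(b) We have

$$N(\mathfrak{a}) = \left| \frac{\Delta[\alpha_1, \ldots, \alpha_n]}{\Delta} \right|^{1/2}$$

where $\Delta$ is the discriminant of $K$.


Proof: We know from Theorem 2.16 that $(\mathfrak{O}, +)$ is free abelian of rank
$n$. Since $\mathfrak{O}/\mathfrak{a}$ is finite it follows from Theorem 1.17 that $(\mathfrak{a}, +)$ is free
abelian of rank $n$, hence has a $\mathbb{Z}$-basis of the form $\{\alpha_1, \ldots, \alpha_n\}$. This
proves (a). For part (b) let $\{\omega_1, \ldots, \omega_n\}$ be a $\mathbb{Z}$-basis for $\mathfrak{O}$, and suppose
that $\alpha_i = \sum c_{ij}\omega_j$. Then by Theorem 1.17,

$$N(\mathfrak{a}) = |\mathfrak{O}/\mathfrak{a}| = |\det c_{ij}|.$$

But by the formula before Theorem 2.7 we have

$$\Delta[\alpha_1, \ldots, \alpha_n] = (\det c_{ij})^2 \Delta[\omega_1, \ldots, \omega_n] = (N(\mathfrak{a}))^2 \Delta.$$

Taking square roots and remembering that $N(\mathfrak{a})$ is positive we obtain the
desired result. □

Corollary 5.10. If $\mathfrak{a} = \langle a \rangle$ is a principal ideal then $N(\mathfrak{a}) = |N(a)|$.


Proof: A $\mathbb{Z}$-basis for $\mathfrak{a}$ is given by $\{a\omega_1, \ldots, a\omega_n\}$. The result follows from
the definition of $\Delta[\alpha_1, \ldots, \alpha_n]$ and Theorem 5.9.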


This corollary helps us to make a straightforward calculation of the 
norm of a principal ideal. 


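Example 5.11. If $\mathfrak{O}$ is the ring of integers of $\mathbb{Q}(\sqrt{d})$ for a square-free
rational integer $d$, then

$$N(\langle a + b\sqrt{d} \rangle) = |a^2 - db^2|;$$

in particular, in $\mathfrak{O} = \mathbb{Z}[\sqrt{-17}]$,

$$N(\langle 18 \rangle) = 18^2.$$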
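The formula can be compared with the determinant description used in the proof of Theorem 5.9. The sketch below is an illustration under an explicit assumption: it takes $\{1, \sqrt{d}\}$ as the $\mathbb{Z}$-basis, which is an integral basis only when $d \not\equiv 1 \pmod 4$. Multiplying the basis by $a + b\sqrt{d}$ gives the rows of the matrix $(c_{ij})$, and $|\det c_{ij}|$ is the norm of the principal ideal. The function names are ours.

```python
def element_norm(a, b, d):
    """Field norm of a + b*sqrt(d): (a + b*sqrt(d)) * (a - b*sqrt(d))."""
    return a * a - d * b * b

def principal_ideal_norm(a, b, d):
    """|det| of multiplication by a + b*sqrt(d) on the basis {1, sqrt(d)}.

    Assumes Z[sqrt(d)] is the full ring of integers (d not 1 mod 4).
    """
    row_one = (a, b)          # (a + b*sqrt(d)) * 1
    row_sqrt = (d * b, a)     # (a + b*sqrt(d)) * sqrt(d)
    det = row_one[0] * row_sqrt[1] - row_one[1] * row_sqrt[0]
    return abs(det)

print(principal_ideal_norm(18, 0, -17))   # 324 = 18^2
print(principal_ideal_norm(1, 1, -17))    # 18
print(principal_ideal_norm(3, 0, -1))     # 9, the ideal <3> in Z[i]
```

In each case the result agrees with `abs(element_norm(a, b, d))`, as Corollary 5.10 predicts.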


The new norm, like the old, is multiplicative: 


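Theorem 5.12. If $\mathfrak{a}$ and $\mathfrak{b}$ are non-zero ideals of $\mathfrak{O}$, then

$$N(\mathfrak{a}\mathfrak{b}) = N(\mathfrak{a})N(\mathfrak{b}).$$


Proof: By uniqueness of factorization and induction on the number of
factors, it is sufficient to prove

$$N(\mathfrak{a}\mathfrak{p}) = N(\mathfrak{a})N(\mathfrak{p}) \tag{5.6}$$

where $\mathfrak{p}$ is prime.
We establish

$$|\mathfrak{O}/\mathfrak{a}\mathfrak{p}| = |\mathfrak{O}/\mathfrak{a}|\,|\mathfrak{a}/\mathfrak{a}\mathfrak{p}| \tag{5.7}$$

and

$$|\mathfrak{a}/\mathfrak{a}\mathfrak{p}| = |\mathfrak{O}/\mathfrak{p}|. \tag{5.8}$$

Then Equation (5.6) follows immediately from Equation (5.7), Equation
(5.8), and the definition of the norm.
Equation (5.7) is a consequence of the isomorphism theorem for rings:
define $\varphi : \mathfrak{O}/\mathfrak{a}\mathfrak{p} \to \mathfrak{O}/\mathfrak{a}$ by

$$\varphi(\mathfrak{a}\mathfrak{p} + x) = \mathfrak{a} + x;$$

then $\varphi$ is a surjective ring homomorphism with kernel $\mathfrak{a}/\mathfrak{a}\mathfrak{p}$; Lagrange’s
theorem (applied to the additive groups) gives Equation (5.7).

To establish Equation (5.8), first note that unique factorization implies
$\mathfrak{a} \neq \mathfrak{a}\mathfrak{p}$, so $\mathfrak{a} \supsetneq \mathfrak{a}\mathfrak{p}$. Now we show that there is no ideal $\mathfrak{b}$ strictly between $\mathfrak{a}$
and $\mathfrak{a}\mathfrak{p}$, for if

$$\mathfrak{a} \supseteq \mathfrak{b} \supseteq \mathfrak{a}\mathfrak{p},$$

then, as fractional ideals,

$$\mathfrak{a}^{-1}\mathfrak{a} \supseteq \mathfrak{a}^{-1}\mathfrak{b} \supseteq \mathfrak{a}^{-1}\mathfrak{a}\mathfrak{p},$$

so

$$\mathfrak{O} \supseteq \mathfrak{a}^{-1}\mathfrak{b} \supseteq \mathfrak{p}.$$

Since $\mathfrak{a}^{-1}\mathfrak{b} \subseteq \mathfrak{O}$, we see that it is actually an ideal, and since $\mathfrak{p}$ is maximal
(by Theorem 5.3 (d)), we have

$$\mathfrak{a}^{-1}\mathfrak{b} = \mathfrak{O} \quad \text{or} \quad \mathfrak{a}^{-1}\mathfrak{b} = \mathfrak{p},$$

so

$$\mathfrak{b} = \mathfrak{a} \quad \text{or} \quad \mathfrak{b} = \mathfrak{a}\mathfrak{p}.$$

This means that for any element $a \in \mathfrak{a} \setminus \mathfrak{a}\mathfrak{p}$, we have

$$\mathfrak{a}\mathfrak{p} + \langle a \rangle = \mathfrak{a}. \tag{5.9}$$

Fix such an $a$ and define $\theta : \mathfrak{O} \to \mathfrak{a}/\mathfrak{a}\mathfrak{p}$ by

$$\theta(x) = \mathfrak{a}\mathfrak{p} + ax;$$

then $\theta$ is an $\mathfrak{O}$-module homomorphism, surjective by Equation (5.9), whose
kernel is an ideal satisfying

$$\mathfrak{p} \subseteq \ker\theta.$$

Now $\ker\theta \neq \mathfrak{O}$ (for that would mean $\mathfrak{a}/\mathfrak{a}\mathfrak{p} \cong \mathfrak{O}/\ker\theta = 0$, which would
contradict $\mathfrak{a} \neq \mathfrak{a}\mathfrak{p}$), and $\mathfrak{p}$ is maximal, so

$$\ker\theta = \mathfrak{p}.$$

Hence $\mathfrak{O}/\mathfrak{p} \cong \mathfrak{a}/\mathfrak{a}\mathfrak{p}$ (as $\mathfrak{O}$-modules), which gives Equation (5.8) and
completes the proof. □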


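Example 5.13. If $\mathfrak{O} = \mathbb{Z}[\sqrt{-17}]$, $\mathfrak{p}_1 = \langle 2, 1 + \sqrt{-17} \rangle$, $\mathfrak{p}_2 = \langle 3, 1 + \sqrt{-17} \rangle$,
$\mathfrak{p}_3 = \langle 3, 1 - \sqrt{-17} \rangle$, then

$$N(\mathfrak{p}_1^2\mathfrak{p}_2^2\mathfrak{p}_3^2) = 2^2 \cdot 3^2 \cdot 3^2 = 18^2.$$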


This particular calculation completes the details of the extended example 
in the previous section. 
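Multiplicativity can also be observed numerically in this example. The following sketch is an illustration only: it represents $x + y\sqrt{-17}$ as the pair `(x, y)`, forms products of ideals from products of generators, and computes each norm as a subgroup index (the greatest common divisor of the $2 \times 2$ minors of a spanning set). The helper names are assumptions made for the example.

```python
from math import gcd
from itertools import combinations

D = -17  # Z[sqrt(D)]; the pair (x, y) stands for x + y*sqrt(D)

def mul(u, v):
    """Product of two elements of Z[sqrt(D)]."""
    return (u[0] * v[0] + D * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def ideal_product(*ideals):
    """Ring generators of a product of ideals."""
    gens = [(1, 0)]
    for ideal in ideals:
        gens = [mul(g, h) for g in gens for h in ideal]
    return gens

def norm(gens):
    """N(ideal) = index in Z[sqrt(D)] = gcd of 2x2 minors of a Z-spanning set."""
    rows = [v for g in gens for v in (g, mul(g, (0, 1)))]
    result = 0
    for (a, b), (c, d) in combinations(rows, 2):
        result = gcd(result, abs(a * d - b * c))
    return result

p1 = [(2, 0), (1, 1)]
p2 = [(3, 0), (1, 1)]
p3 = [(3, 0), (1, -1)]

print(norm(ideal_product(p1, p2)), norm(p1) * norm(p2))   # 6 6
print(norm(ideal_product(p1, p1, p2, p2, p3, p3)))        # 324
```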


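It is convenient to introduce yet another usage for the word ‘divides’.
If $\mathfrak{a}$ is an ideal of $\mathfrak{O}$ and $b$ an element of $\mathfrak{O}$ such that $\mathfrak{a} \mid \langle b \rangle$, then we also
write $\mathfrak{a} \mid b$ and say that $\mathfrak{a}$ divides $b$. It is clear that $\mathfrak{a} \mid b$ if and only if $b \in \mathfrak{a}$;
however, the new notation has certain distinct advantages. For example,
if $\mathfrak{p}$ is a prime ideal and $\mathfrak{p} \mid \langle a \rangle \langle b \rangle$, then we must have $\mathfrak{p} \mid \langle a \rangle$ or $\mathfrak{p} \mid \langle b \rangle$. Thus
for $\mathfrak{p}$ prime,

$$\mathfrak{p} \mid ab \text{ implies } \mathfrak{p} \mid a \text{ or } \mathfrak{p} \mid b.$$

This new notation allows us to emphasize the correspondence between
factorization of elements and principal ideals which would otherwise be less
evident.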


Theorem 5.14. Let $\mathfrak{a}$ be an ideal of $\mathfrak{O}$, $\mathfrak{a} \neq 0$.
(a) If $N(\mathfrak{a})$ is prime, then so is $\mathfrak{a}$.
(b) $N(\mathfrak{a})$ is an element of $\mathfrak{a}$, or equivalently $\mathfrak{a} \mid N(\mathfrak{a})$.
(c) If $\mathfrak{a}$ is prime it divides exactly one rational prime $p$, and then

$$N(\mathfrak{a}) = p^m$$

where $m \leq n$, the degree of $K$.


Proof: For part (a) write $\mathfrak{a}$ as a product of prime ideals and equate norms.
For part (b) note that since $N(\mathfrak{a}) = |\mathfrak{O}/\mathfrak{a}|$ it follows that for any $x \in \mathfrak{O}$ we
have $N(\mathfrak{a})x \in \mathfrak{a}$. Now put $x = 1$. For part (c) we note that by part (b)

$$\mathfrak{a} \mid N(\mathfrak{a}) = p_1^{m_1} \cdots p_r^{m_r}$$

so (considering principal ideals in place of the $p_i$) we have $\mathfrak{a} \mid p_i$ for some
rational prime $p_i$. If $p$ and $q$ were distinct rational primes, both divisible
by $\mathfrak{a}$, we could find integers $u$, $v$ such that $up + vq = 1$, and then deduce
that $\mathfrak{a} \mid 1$, which implies $\mathfrak{a} = \mathfrak{O}$, a contradiction. Then

$$N(\mathfrak{a}) \mid N(\langle p \rangle) = p^n$$

so that $N(\mathfrak{a}) = p^m$ for some $m \leq n$. □


Example 5.15. If $\mathfrak{O} = \mathbb{Z}[\sqrt{-17}]$, $\mathfrak{p}_1 = \langle 2, 1 + \sqrt{-17} \rangle$, then because
$N(\mathfrak{p}_1) = 2$ we can immediately deduce that $\mathfrak{p}_1$ is prime. Note that
$N(\mathfrak{p}_1) = 2 \in \mathfrak{p}_1$, as asserted by Theorem 5.14 (b).


Example 5.16. A prime ideal $\mathfrak{a}$ can satisfy

$$N(\mathfrak{a}) = p^m$$

where $m > 1$, which means that a prime ideal does not necessarily have a
norm which is prime; for instance $\mathfrak{O} = \mathbb{Z}[i]$, $\mathfrak{a} = \langle 3 \rangle$. Here 3 is irreducible
in $\mathbb{Z}[i]$, hence prime because $\mathbb{Z}[i]$ has unique factorization. It is an easy
deduction (Exercise 1 at the end of this chapter) that if an element is
prime, so is the ideal it generates. Hence $\langle 3 \rangle$ is prime in $\mathbb{Z}[i]$, but

$$N(\langle 3 \rangle) = 3^2.$$
The next theorem collects together several useful finiteness assertions: 


Theorem 5.17. (a) Every non-zero ideal of $\mathfrak{O}$ has a finite number of
divisors,

(b) A non-zero rational integer belongs to only a finite number of ideals
of $\mathfrak{O}$,

(c) Only finitely many ideals of $\mathfrak{O}$ have given norm.


Proof: (a) is an immediate consequence of prime factorization, (b) is a
special case of (a), and (c) follows from (b) using Theorem 5.14 (b). □


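Example 5.18. Consider our earlier calculation

$$\langle 18 \rangle = \mathfrak{p}_1^2\mathfrak{p}_2^2\mathfrak{p}_3^2$$

in $\mathbb{Z}[\sqrt{-17}]$ where $\mathfrak{p}_1 = \langle 2, 1 + \sqrt{-17} \rangle$, $\mathfrak{p}_2 = \langle 3, 1 + \sqrt{-17} \rangle$, and
$\mathfrak{p}_3 = \langle 3, 1 - \sqrt{-17} \rangle$. We find the only prime divisors of $\langle 18 \rangle$ are $\mathfrak{p}_1$, $\mathfrak{p}_2$, $\mathfrak{p}_3$.
If 18 belongs to some ideal $\mathfrak{a}$, then $\langle 18 \rangle \subseteq \mathfrak{a}$, whence $\mathfrak{a} \mid \langle 18 \rangle$, so $\mathfrak{a} \mid \mathfrak{p}_1^2\mathfrak{p}_2^2\mathfrak{p}_3^2$
and $\mathfrak{a} = \mathfrak{p}_1^q\mathfrak{p}_2^r\mathfrak{p}_3^s$ where $q$, $r$, $s$ are 0, 1 or 2. Thus 18 belongs only to a finite
number of ideals.
How many ideals $\mathfrak{a}$ have norm 18? This can only happen when $\mathfrak{a} \mid 18$ by
Theorem 5.14 (b), so

$$\mathfrak{a} = \mathfrak{p}_1^q\mathfrak{p}_2^r\mathfrak{p}_3^s$$

which implies

$$N(\mathfrak{a}) = 2^q \cdot 3^r \cdot 3^s.$$

This norm is 18 only when $q = 1$ and $r + s = 2$, which means that $\mathfrak{a}$ is
$\mathfrak{p}_1\mathfrak{p}_2^2$, $\mathfrak{p}_1\mathfrak{p}_2\mathfrak{p}_3$ or $\mathfrak{p}_1\mathfrak{p}_3^2$.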
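The finiteness argument amounts to a small enumeration, which the following sketch spells out. It is an illustration only, and takes the norms $N(\mathfrak{p}_1) = 2$, $N(\mathfrak{p}_2) = N(\mathfrak{p}_3) = 3$ and the multiplicativity of the norm as given; a triple `(q, r, s)` stands for the ideal $\mathfrak{p}_1^q\mathfrak{p}_2^r\mathfrak{p}_3^s$.

```python
from itertools import product

# Every ideal containing 18 is p1^q p2^r p3^s with exponents 0, 1 or 2.
divisors = list(product(range(3), repeat=3))

# Norms multiply, with N(p1) = 2 and N(p2) = N(p3) = 3.
norm_18 = [(q, r, s) for (q, r, s) in divisors if 2**q * 3**r * 3**s == 18]

print(len(divisors))   # 27 ideals contain the element 18
print(norm_18)         # [(1, 0, 2), (1, 1, 1), (1, 2, 0)]
```

The three triples correspond to $\mathfrak{p}_1\mathfrak{p}_3^2$, $\mathfrak{p}_1\mathfrak{p}_2\mathfrak{p}_3$ and $\mathfrak{p}_1\mathfrak{p}_2^2$ respectively.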

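We know that every ideal of $\mathfrak{O}$ is finitely generated. In fact, we shall
prove that two generators suffice.


Lemma 5.19. If $\mathfrak{a}$, $\mathfrak{b}$ are non-zero ideals of $\mathfrak{O}$ then there exists $\alpha \in \mathfrak{a}$ such
that

$$\alpha\mathfrak{a}^{-1} + \mathfrak{b} = \mathfrak{O}.$$


Proof: First note that if $\alpha \in \mathfrak{a}$ we have $\mathfrak{a} \mid \alpha$ so that $\alpha\mathfrak{a}^{-1}$ is an ideal and
not just a fractional ideal. Now $\alpha\mathfrak{a}^{-1} + \mathfrak{b}$ is the greatest common divisor
of $\alpha\mathfrak{a}^{-1}$ and $\mathfrak{b}$, so it is sufficient to choose $\alpha \in \mathfrak{a}$ so that

$$\alpha\mathfrak{a}^{-1} + \mathfrak{p}_i = \mathfrak{O} \qquad (i = 1, \ldots, r)$$

where $\mathfrak{p}_1, \ldots, \mathfrak{p}_r$ are the distinct prime ideals dividing $\mathfrak{b}$. This will follow
if

$$\mathfrak{p}_i \nmid \alpha\mathfrak{a}^{-1}$$

since $\mathfrak{p}_i$ is a maximal ideal. So it is sufficient to choose $\alpha \in \mathfrak{a} \setminus \mathfrak{a}\mathfrak{p}_i$ for all
$i = 1, \ldots, r$.

If $r = 1$ this is easy, for unique factorization of ideals implies $\mathfrak{a} \neq \mathfrak{a}\mathfrak{p}_1$.
For $r > 1$ let

$$\mathfrak{a}_i = \mathfrak{a}\mathfrak{p}_1 \cdots \mathfrak{p}_{i-1}\mathfrak{p}_{i+1} \cdots \mathfrak{p}_r.$$

By the case $r = 1$ we can choose

$$\alpha_i \in \mathfrak{a}_i \setminus \mathfrak{a}_i\mathfrak{p}_i.$$

Define

$$\alpha = \alpha_1 + \cdots + \alpha_r.$$

Then each $\alpha_i \in \mathfrak{a}_i \subseteq \mathfrak{a}$, so $\alpha \in \mathfrak{a}$. Suppose if possible that $\alpha \in \mathfrak{a}\mathfrak{p}_i$. If $j \neq i$
then $\alpha_j \in \mathfrak{a}_j \subseteq \mathfrak{a}\mathfrak{p}_i$, so it follows that

$$\alpha_i = \alpha - \alpha_1 - \cdots - \alpha_{i-1} - \alpha_{i+1} - \cdots - \alpha_r \in \mathfrak{a}\mathfrak{p}_i.$$

Hence $\mathfrak{a}\mathfrak{p}_i \mid \langle \alpha_i \rangle$. On the other hand $\mathfrak{a}_i \mid \langle \alpha_i \rangle$. Since the least common
multiple of $\mathfrak{a}\mathfrak{p}_i$ and $\mathfrak{a}_i$ is $\mathfrak{a}\mathfrak{p}_1 \cdots \mathfrak{p}_r = \mathfrak{a}_i\mathfrak{p}_i$, we have $\mathfrak{a}_i\mathfrak{p}_i \mid \langle \alpha_i \rangle$. This
contradicts the choice of $\alpha_i$. □

Theorem 5.20. Let $\mathfrak{a} \neq 0$ be an ideal of $\mathfrak{O}$, and $0 \neq \beta \in \mathfrak{a}$. Then there
exists $\alpha \in \mathfrak{a}$ such that $\mathfrak{a} = \langle \alpha, \beta \rangle$.


Proof: Let $\mathfrak{b} = \beta\mathfrak{a}^{-1}$. By Lemma 5.19 there exists $\alpha \in \mathfrak{a}$ such that

$$\alpha\mathfrak{a}^{-1} + \mathfrak{b} = \alpha\mathfrak{a}^{-1} + \beta\mathfrak{a}^{-1} = \mathfrak{O},$$

hence

$$(\langle \alpha \rangle + \langle \beta \rangle)\mathfrak{a}^{-1} = \mathfrak{O},$$

so that

$$\mathfrak{a} = \langle \alpha \rangle + \langle \beta \rangle = \langle \alpha, \beta \rangle. \qquad \Box$$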

This theorem demonstrates that the experience of the earlier extended 
example, where each ideal considered had at most two generators, is entirely 
typical of ideals in a ring of integers of a number field. 


We are now in a position to characterize those $\mathfrak{O}$ for which factorization
of elements into irreducibles is unique:


Theorem 5.21. Factorization of elements of $\mathfrak{O}$ into irreducibles is unique
if and only if every ideal of $\mathfrak{O}$ is principal.


Proof: If every ideal is principal, then unique factorization of elements
follows by Theorem 4.15. To prove the converse, if factorization of elements
is unique, it will be sufficient to prove that every prime ideal is principal,
since every other ideal, being a product of prime ideals, would then be
principal. Let $\mathfrak{p} \neq 0$ be a prime ideal of $\mathfrak{O}$. By Theorem 5.14 (b) there
exists a rational integer $N = N(\mathfrak{p})$ such that $\mathfrak{p} \mid N$. We can factorize $N$ as a
product of irreducible elements in $\mathfrak{O}$, say

$$N = \pi_1 \cdots \pi_s.$$

Since $\mathfrak{p} \mid N$ and $\mathfrak{p}$ is a prime ideal, it follows that $\mathfrak{p} \mid \pi_i$ for some $i$, or equivalently, $\mathfrak{p} \mid \langle \pi_i \rangle$.
But factorization being unique in $\mathfrak{O}$, the irreducible $\pi_i$ is actually prime
by Theorem 4.13, and then the principal ideal $\langle \pi_i \rangle$ is prime (Exercise 1 at
the end of the chapter). Thus $\mathfrak{p} \mid \langle \pi_i \rangle$, where both $\mathfrak{p}$, $\langle \pi_i \rangle$ are prime, and by
uniqueness of factorization,

$$\mathfrak{p} = \langle \pi_i \rangle,$$

so $\mathfrak{p}$ is principal. □

Using this theorem we can nicely round off the relationship between
factorization of elements and ideals. To do this, consider an element $\pi$
which is irreducible but not prime. Then the ideal $\langle \pi \rangle$ is not prime, so has
a proper factorization into prime ideals:

$$\langle \pi \rangle = \mathfrak{p}_1 \cdots \mathfrak{p}_r.$$

Now none of these $\mathfrak{p}_i$ can be principal, for if $\mathfrak{p}_i = \langle a \rangle$, then $\langle a \rangle \mid \langle \pi \rangle$, implying
$a \mid \pi$. Since $\pi$ is irreducible, $a$ would either be a unit (contradicting $\langle a \rangle$
prime), or an associate of $\pi$, whence $\langle \pi \rangle = \mathfrak{p}_i$, contradicting the fact that
$\langle \pi \rangle$ has a proper factorization.

Tying up the loose ends, we see that if $\mathfrak{O}$ has unique factorization of
elements into irreducibles, then these irreducibles are all primes; and
factorization of elements corresponds precisely to factorization of the corresponding
principal ideals. On the other hand, if $\mathfrak{O}$ does not have unique
factorization of elements, then not all irreducibles are prime, and any non-prime
irreducible generates a principal ideal which has a proper factorization into
non-principal ideals. We may add in the latter case that such non-principal
ideals have precisely two generators.


Example 5.22. In $\mathbb{Z}[\sqrt{-17}]$, the elements 2, 3 are irreducible (proved by
considering norms) and not prime, with

$$\begin{aligned}
\langle 2 \rangle &= \langle 2, 1 + \sqrt{-17} \rangle^2, \\
\langle 3 \rangle &= \langle 3, 1 + \sqrt{-17} \rangle \langle 3, 1 - \sqrt{-17} \rangle.
\end{aligned}$$

5.4 Nonunique Factorization in Cyclotomic Fields 


We mentioned in the introductory section that unique prime factorization 
fails in the cyclotomic field of 23rd roots of unity. (The failure, rather 
than the precise value n = 23, is the crucial point!) In this (optional) 
section we use the tools developed in this chapter to demonstrate this 
result. The calculations are somewhat tedious and have been abbreviated 
where feasible: the energetic reader may care to check the details. A few 
tricks, inspired by the structure of the group of units of the ring $\mathbb{Z}_{23}$, are
used; but we lack the space to motivate them. For further details see the
admirable book by Edwards [22].

Let $\zeta = e^{2\pi i/23}$, and let $K = \mathbb{Q}(\zeta)$. By Theorem 3.5 we know that the
ring of integers $\mathfrak{O}_K$ is $\mathbb{Z}[\zeta]$. The group of units of $\mathbb{Z}_{23}$ is generated by $-2$,
whose powers in order are


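$$1, -2, 4, -8, -7, -9, -5, 10, 3, -6, -11, -1, 2, -4, 8, 7, 9, 5, -10, -3, 6, 11. \tag{5.10}$$

For reasons that will emerge later we introduce two elements

$$\begin{aligned}
\theta_0 &= \zeta + \zeta^4 + \zeta^{-7} + \zeta^{-5} + \cdots, \\
\theta_1 &= \zeta^{-2} + \zeta^{-8} + \zeta^{-9} + \zeta^{10} + \cdots.
\end{aligned}$$

The powers that occur are alternate elements in the sequence (5.10). We
have

$$\theta_0 + \theta_1 = \zeta + \zeta^2 + \cdots + \zeta^{22} = -1,$$

$$\theta_0\theta_1 = 6.$$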


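The norm of a general element $f(\zeta)$, with $f$ a polynomial over $\mathbb{Z}$ of degree
$\leq 22$, can be broken up as

$$\begin{aligned}
N(f(\zeta)) &= \prod_{j=1}^{22} f(\zeta^j) \\
&= \prod_{j \text{ even}} f(\zeta^{(-2)^j}) \cdot \prod_{j \text{ odd}} f(\zeta^{(-2)^j}) \\
&= G(\zeta)G(\zeta^{-2})
\end{aligned} \tag{5.11}$$

where

$$G(\zeta) = f(\zeta)f(\zeta^4)f(\zeta^{-7})f(\zeta^{-5})f(\zeta^3)f(\zeta^{-11})f(\zeta^2)f(\zeta^8)f(\zeta^9)f(\zeta^{-10})f(\zeta^6).$$

By definition, $G(\zeta)$ is invariant under the linear mapping $\alpha$ sending $\zeta^j$
to $\zeta^{4j}$. But it is easy to check that an element fixed by $\alpha$ must be of the
form $a + b\theta_0$ for $a$, $b \in \mathbb{Z}$. (Either use Galois theory, or a direct argument
based on the linear independence over $\mathbb{Q}$ of $\{1, \zeta, \ldots, \zeta^{21}\}$.)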

We pull out of a hat the element

$$\mu = 1 - \zeta + \zeta^{21} = 1 - \zeta + \zeta^{-2},$$

which Kummer found by a great deal of (fairly systematic) experimentation.
Using (5.10) above and a lot of paper and ink we eventually find
that

$$N(\mu) = (-31 + 28\theta_0)(-31 + 28\theta_1) = 6533 = 47 \cdot 139.$$

By Theorem 5.14 the principal ideal $\mathfrak{m} = \langle \mu \rangle$ cannot be prime, hence it
must be a nontrivial product of prime ideals, say

$$\mathfrak{m} = \mathfrak{p}\mathfrak{q}.$$

Taking norms we must (without loss of generality) have $N(\mathfrak{p}) = 47$,
$N(\mathfrak{q}) = 139$. If $K$ has unique factorization then every ideal, in particular $\mathfrak{p}$, is
principal by Theorem 5.21. Hence $\mathfrak{p} = \langle \nu \rangle$ for some $\nu \in \mathbb{Z}[\zeta]$. Clearly
$N(\nu) = \pm 47$ by Corollary 5.10.

We claim this is impossible. We have already observed that $G(\zeta)$ can
always be expressed in the form $a + b\theta_0$ $(a, b \in \mathbb{Z})$; and then $G(\zeta^{-2})$ must
be equal to $a + b\theta_1$. Hence, setting $f(\zeta) = \nu$, we get

$$\pm 47 = (a + b\theta_0)(a + b\theta_1) = a^2 - ab + 6b^2.$$

Multiplying by 4 and regrouping, we find that

$$(2a - b)^2 + 23b^2 = \pm 188.$$

The sign must be positive. A simple trial-and-error analysis (involving
only two cases) shows that 188 cannot be written in the form $P^2 + 23Q^2$.
This contradiction establishes that prime factorization of elements cannot
be unique in $\mathfrak{O}_K$.
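The purely arithmetical parts of this argument are easy to confirm by machine. The sketch below is a hedged illustration and checks only integer facts: that $-2$ generates the units of $\mathbb{Z}_{23}$, that the alternate powers are the squares modulo 23, that the quadratic form $a^2 - ab + 6b^2$ takes the value 6533 at $(a, b) = (-31, 28)$, and that it never takes the value 47. It does not recompute $G(\zeta)$ for $\mu$; the function name `form` is ours.

```python
# Powers of -2 modulo 23, written as least non-negative residues.
powers = [pow(-2, k, 23) for k in range(22)]
print(len(set(powers)) == 22)        # -2 generates all 22 units

# Alternate powers (even exponents) are exactly the non-zero squares mod 23.
squares = sorted({x * x % 23 for x in range(1, 23)})
print(sorted(powers[0::2]) == squares)

def form(a, b):
    """(a + b*theta0)(a + b*theta1), using theta0 + theta1 = -1, theta0*theta1 = 6."""
    return a * a - a * b + 6 * b * b

print(form(-31, 28), 47 * 139)       # both equal 6533

# 4*form(a, b) = (2a - b)^2 + 23 b^2, so form(a, b) = 47 forces |b| <= 2
# and |2a - b| <= 13; the search box below is therefore exhaustive.
solutions = [(a, b) for a in range(-20, 21) for b in range(-3, 4)
             if form(a, b) == 47]
print(solutions)                     # []
```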


5.5 Exercises 


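1. In an integral domain $D$, show that a principal ideal $\langle p \rangle$ is prime if
and only if $p$ is a prime or zero.


2. In $\mathbb{Z}[\sqrt{-5}]$, define the ideals

$$\begin{aligned}
\mathfrak{p} &= \langle 2, 1 + \sqrt{-5} \rangle, \\
\mathfrak{q} &= \langle 3, 1 + \sqrt{-5} \rangle, \\
\mathfrak{r} &= \langle 3, 1 - \sqrt{-5} \rangle.
\end{aligned}$$

Prove that these are maximal ideals, hence prime. Show that

$$\begin{aligned}
\mathfrak{p}^2 &= \langle 2 \rangle, & \mathfrak{q}\mathfrak{r} &= \langle 3 \rangle, \\
\mathfrak{p}\mathfrak{q} &= \langle 1 + \sqrt{-5} \rangle, & \mathfrak{p}\mathfrak{r} &= \langle 1 - \sqrt{-5} \rangle.
\end{aligned}$$

Show that the factorizations of 6 given in the proof of Theorem 4.10
come from two different groupings of the factorization into prime
ideals $\langle 6 \rangle = \mathfrak{p}^2\mathfrak{q}\mathfrak{r}$.


3. Calculate the norms of the ideals mentioned in Exercise 2 and check
multiplicativity.


4. Prove that the ideals $\mathfrak{p}$, $\mathfrak{q}$, $\mathfrak{r}$ of Exercise 2 cannot be principal.


5. Show the principal ideals $\langle 2 \rangle$, $\langle 3 \rangle$ in Exercise 2 are generated by
irreducible elements but the ideals are not prime.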
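6. In $\mathbb{Z}[\sqrt{-6}]$ we have

$$6 = 2 \cdot 3 = (\sqrt{-6})(-\sqrt{-6}).$$

Factorize these elements further in the extension ring $\mathbb{Z}[\sqrt{2}, \sqrt{-3}]$ as

$$6 = (-1)\sqrt{2}\sqrt{2}\sqrt{-3}\sqrt{-3}.$$

Show that if $\mathfrak{I}_1$ is the principal ideal in $\mathbb{Z}[\sqrt{2}, \sqrt{-3}]$ generated by $\sqrt{2}$,
then

$$\mathfrak{p}_1 = \mathfrak{I}_1 \cap \mathbb{Z}[\sqrt{-6}] = \langle 2, \sqrt{-6} \rangle.$$

Demonstrate that $\mathfrak{p}_1$ is maximal in $\mathbb{Z}[\sqrt{-6}]$, hence prime; and find
another prime ideal $\mathfrak{p}_2$ in $\mathbb{Z}[\sqrt{-6}]$ such that

$$\langle 6 \rangle = \mathfrak{p}_1^2\mathfrak{p}_2^2.$$


7. Factorize $14 = 2 \cdot 7 = (2 + \sqrt{-10})(2 - \sqrt{-10})$ further in $\mathbb{Z}[\sqrt{-5}, \sqrt{2}]$
and by intersecting appropriate ideals with $\mathbb{Z}[\sqrt{-10}]$, factorize the
ideal $\langle 14 \rangle$ into prime (maximal) ideals in $\mathbb{Z}[\sqrt{-10}]$.


8. Suppose $\mathfrak{p}$, $\mathfrak{q}$ are distinct prime ideals in $\mathfrak{O}$. Show $\mathfrak{p} + \mathfrak{q} = \mathfrak{O}$ and
$\mathfrak{p} \cap \mathfrak{q} = \mathfrak{p}\mathfrak{q}$.


9. If $\mathfrak{O}$ is a principal ideal domain, prove that every fractional ideal is
of the form $\{\alpha\varphi \mid \alpha \in \mathfrak{O}\}$ for some $\varphi \in K$. Does the converse hold?


10. Find all fractional ideals of $\mathbb{Z}$ and of $\mathbb{Z}[\sqrt{-1}]$.


11. In $\mathbb{Z}[\sqrt{-5}]$, find a $\mathbb{Z}$-basis $\{\alpha_1, \alpha_2\}$ for the ideal $\langle 2, 1 + \sqrt{-5} \rangle$. Check
the formula

$$N(\langle 2, 1 + \sqrt{-5} \rangle) = \left| \frac{\Delta[\alpha_1, \alpha_2]}{\Delta} \right|^{1/2}$$

of Theorem 5.9.


12. Find all the ideals in $\mathbb{Z}[\sqrt{-5}]$ which contain the element 6.


13. Find all the ideals in $\mathbb{Z}[\sqrt{2}]$ with norm 18.


14. If $K$ is a number field of degree $n$ with integers $\mathfrak{O}$, show that if $m \in \mathbb{Z}$
and $\langle m \rangle$ is the ideal in $\mathfrak{O}$ generated by $m$, then

$$N(\langle m \rangle) = |m|^n.$$


15. In $\mathbb{Z}[\sqrt{-29}]$ we have

$$30 = 2 \cdot 3 \cdot 5 = (1 + \sqrt{-29})(1 - \sqrt{-29}).$$

Show

$$\langle 30 \rangle \subseteq \langle 2, 1 + \sqrt{-29} \rangle$$

and verify $\mathfrak{p}_1 = \langle 2, 1 + \sqrt{-29} \rangle$ has norm 2 and is thus prime. Check
that $1 - \sqrt{-29} \in \mathfrak{p}_1$ and deduce $\langle 30 \rangle \subseteq \mathfrak{p}_1^2$. Find prime ideals
$\mathfrak{p}_2$, $\mathfrak{p}_2'$, $\mathfrak{p}_3$, $\mathfrak{p}_3'$ with norms 3 or 5 such that

$$\langle 30 \rangle \subseteq \mathfrak{p}_i\mathfrak{p}_i' \qquad (i = 2, 3).$$

Deduce that $\mathfrak{p}_1^2\mathfrak{p}_2\mathfrak{p}_2'\mathfrak{p}_3\mathfrak{p}_3' \mid \langle 30 \rangle$ and by calculating norms, or otherwise,
show that

$$\langle 30 \rangle = \mathfrak{p}_1^2\mathfrak{p}_2\mathfrak{p}_2'\mathfrak{p}_3\mathfrak{p}_3'.$$

Comment on how this relates to the two factorizations:

$$\begin{aligned}
\langle 30 \rangle &= \langle 2 \rangle \langle 3 \rangle \langle 5 \rangle, \\
\langle 30 \rangle &= \langle 1 + \sqrt{-29} \rangle \langle 1 - \sqrt{-29} \rangle.
\end{aligned}$$


16. Find all ideals in $\mathbb{Z}[\sqrt{-29}]$ containing the element 30.


II


Geometric Methods

Lattices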


At this stage we take a radical new view of the theory, turning from purely 
algebraic methods to techniques inspired by geometry. This approach
requires a different attitude of mind from the reader, in which formal ideas
are built on a visual foundation. We begin with basic properties of lattices:
subsets of $\mathbb{R}^n$ which in some sense generalize the way $\mathbb{Z}$ is embedded in $\mathbb{R}$.
We characterize lattices topologically as the discrete subgroups of $\mathbb{R}^n$. We
introduce the fundamental domain and quotient torus corresponding to a 
lattice and relate the two concepts. Finally we define a concept of volume 
for subsets of the quotient torus. 


6.1 Lattices 


Let $e_1, \ldots, e_m$ be a linearly independent set of vectors in $\mathbb{R}^n$ (so that
$m \leq n$). The additive subgroup of $(\mathbb{R}^n, +)$ generated by $e_1, \ldots, e_m$ is
called a lattice of dimension $m$, generated by $e_1, \ldots, e_m$. Figure 6.1 shows
a lattice of dimension 2 in $\mathbb{R}^2$, generated by $(1, 2)$ and $(2, -1)$. (Do not
confuse this with any other uses of the word ‘lattice’ in algebra.) Obviously,
as regards the group-theoretic structure, a lattice of dimension $m$ is a free
abelian group of rank $m$, so we can apply the terminology and theory of
free abelian groups to lattices.

Figure 6.1. The lattice in $\mathbb{R}^2$ generated by $e_1 = (1, 2)$ and $e_2 = (2, -1)$.

We shall give a topological characterization of lattices. Let $\mathbb{R}^n$ be
equipped with the usual metric (à la Pythagoras), where $\|x - y\|$ denotes
the distance between $x$ and $y$, and denote the (closed) ball centre $x$ radius
$r$ by $B_r[x]$. Recall that a subset $X \subseteq \mathbb{R}^n$ is bounded if $X \subseteq B_r[0]$ for some
$r$. We say that a subset of $\mathbb{R}^n$ is discrete if and only if it intersects every
$B_r[0]$ in a finite set.


Theorem 6.1. An additive subgroup of $\mathbb{R}^n$ is a lattice if and only if it is
discrete.


Proof: Suppose $L$ is a lattice. By passing to the subspace spanned by $L$
we may assume $L$ has dimension $n$. Let $L$ be generated by $e_1, \ldots, e_n$; then
these vectors form a basis for the space $\mathbb{R}^n$. Every $v \in \mathbb{R}^n$ has a unique
representation

$$v = \lambda_1 e_1 + \cdots + \lambda_n e_n \qquad (\lambda_i \in \mathbb{R}).$$

Define $f : \mathbb{R}^n \to \mathbb{R}^n$ by

$$f(\lambda_1 e_1 + \cdots + \lambda_n e_n) = (\lambda_1, \ldots, \lambda_n).$$

Then $f(B_r[0])$ is bounded, say

$$\|f(v)\| \leq k \quad \text{for } v \in B_r[0].$$

If $\sum a_i e_i \in B_r[0]$ $(a_i \in \mathbb{Z})$, then certainly $\|(a_1, \ldots, a_n)\| \leq k$. This implies

$$|a_i| \leq \|(a_1, \ldots, a_n)\| \leq k. \tag{6.1}$$

The number of integer solutions of (6.1) is finite and so $L \cap B_r[0]$, being a
subset of the solutions of (6.1), is also finite, and $L$ is discrete.
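The bound (6.1) turns directly into a finite search. The sketch below is an illustration for a two-dimensional lattice only: it bounds each coefficient by $r$ times the length of the corresponding row of the inverse basis matrix (one valid choice of the constant $k$, by the Cauchy-Schwarz inequality) and then lists the lattice points in $B_r[0]$. The function name is ours, and the basis is that of Figure 6.1.

```python
from math import ceil, sqrt

def lattice_points_in_ball(e1, e2, r):
    """All points a*e1 + b*e2 (a, b integers) with norm at most r (r an integer)."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    # Rows of the inverse of the matrix whose columns are e1, e2: these
    # recover the coefficients (a, b) from a point, so |a| <= r * |row|.
    inverse_rows = [(e2[1] / det, -e2[0] / det), (-e1[1] / det, e1[0] / det)]
    bound_a, bound_b = (ceil(r * sqrt(x * x + y * y)) for x, y in inverse_rows)
    points = []
    for a in range(-bound_a, bound_a + 1):
        for b in range(-bound_b, bound_b + 1):
            x = a * e1[0] + b * e2[0]
            y = a * e1[1] + b * e2[1]
            if x * x + y * y <= r * r:       # exact integer comparison
                points.append((x, y))
    return points

print(len(lattice_points_in_ball((1, 2), (2, -1), 3)))   # 5
print(len(lattice_points_in_ball((1, 2), (2, -1), 5)))   # 21
```

For this basis $\|ae_1 + be_2\|^2 = 5(a^2 + b^2)$, so the counts can be checked by hand.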
Conversely, let $G$ be a discrete subgroup of $\mathbb{R}^n$. We prove by induction
on $n$ that $G$ is a lattice. Let $\{g_1, \ldots, g_m\}$ be a maximal linearly
independent subset of $G$, let $V$ be the subspace spanned by $\{g_1, \ldots, g_{m-1}\}$, and
let $G_0 = G \cap V$. Then $G_0$ is discrete so, by induction, is a lattice. Hence
there exist linearly independent elements $h_1, \ldots, h_{m'}$ generating $G_0$. Since
the elements $g_1, \ldots, g_{m-1} \in G_0$ we have $m' = m - 1$, and we can replace
$\{g_1, \ldots, g_{m-1}\}$ by $\{h_1, \ldots, h_{m-1}\}$, or equivalently assume that every
element of $G_0$ is a $\mathbb{Z}$-linear combination of $g_1, \ldots, g_{m-1}$. Let $T$ be the subset
of all $x \in G$ of the form

$$x = a_1 g_1 + \cdots + a_m g_m$$

with $a_i \in \mathbb{R}$, such that

$$\begin{aligned}
0 \leq a_i &< 1 \qquad (i = 1, \ldots, m - 1), \\
0 \leq a_m &\leq 1.
\end{aligned}$$

Then $T$ is bounded, hence finite since $G$ is discrete, and we may therefore
choose $x' \in T$ with smallest non-zero coefficient $a_m$, say

$$x' = b_1 g_1 + \cdots + b_m g_m.$$

Certainly $\{g_1, \ldots, g_{m-1}, x'\}$ is linearly independent. Now starting with
any vector $g \in G$ we can select integer coefficients $c_i$ so that

$$g' = g - c_m x' - c_1 g_1 - \cdots - c_{m-1} g_{m-1}$$

lies in $T$, and the coefficient of $g_m$ in $g'$ is less than $b_m$, but non-negative.
By choice of $x'$ this coefficient must be zero, so $g' \in G_0$. Hence
$\{x', g_1, \ldots, g_{m-1}\}$ generates $G$, and $G$ is a lattice. □


If $L$ is a lattice generated by $\{e_1, \ldots, e_n\}$ we define the fundamental
domain $T$ to consist of all elements $\sum a_i e_i$ $(a_i \in \mathbb{R})$ for which

$$0 \leq a_i < 1.$$

Note that this depends on the choice of generators.


Lemma 6.2. Each element of $\mathbb{R}^n$ lies in exactly one of the sets $T + l$ for
$l \in L$.

Figure 6.2. A fundamental domain $T$ for the lattice of Figure 6.1, and a translate
$T + l$. Dotted lines indicate omission of boundaries.


Proof: Chop off the integer parts of the coefficients. O 


Figure 6.2 illustrates the concept of a fundamental domain, and the 
result of Lemma 6.2, for the lattice of Figure 6.1. 
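The one-line proof of Lemma 6.2 is effectively an algorithm: express the point in terms of the generators, split each coefficient into its integer and fractional parts, and read off the lattice vector $l$ and the point of $T$. The sketch below illustrates this for a lattice in $\mathbb{R}^2$ with integer input; it uses exact rational arithmetic and Cramer's rule, and the function name is ours.

```python
from fractions import Fraction
from math import floor

def reduce_to_fundamental_domain(e1, e2, x):
    """Return (l, t) with x = l + t, l in the lattice and t in T.

    Coefficients a, b with x = a*e1 + b*e2 are found by Cramer's rule;
    their floors give l, and the fractional parts give t.
    """
    det = e1[0] * e2[1] - e1[1] * e2[0]
    a = Fraction(x[0] * e2[1] - x[1] * e2[0], det)
    b = Fraction(e1[0] * x[1] - e1[1] * x[0], det)
    m, n = floor(a), floor(b)                      # chop off integer parts
    l = (m * e1[0] + n * e2[0], m * e1[1] + n * e2[1])
    t = (x[0] - l[0], x[1] - l[1])
    return l, t

# Lattice of Figure 6.1, generated by (1, 2) and (2, -1).
print(reduce_to_fundamental_domain((1, 2), (2, -1), (4, 1)))    # ((3, 1), (1, 0))
print(reduce_to_fundamental_domain((1, 2), (2, -1), (-1, 0)))   # ((-3, -1), (2, 1))
```

In the first case $(4, 1) = \tfrac{6}{5}e_1 + \tfrac{7}{5}e_2$, so $l = e_1 + e_2 = (3, 1)$ and the remainder $(1, 0) = \tfrac{1}{5}e_1 + \tfrac{2}{5}e_2$ lies in $T$.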


6.2 The Quotient Torus 


Let $L$ be a lattice in $\mathbb{R}^n$, and assume to start that $L$ has dimension $n$. We
shall study the quotient group $\mathbb{R}^n/L$.

Let $S$ denote the set of all complex numbers of modulus 1. Under
multiplication $S$ is a group, called for obvious reasons the circle group.


Lemma 6.3. The quotient group $\mathbb{R}/\mathbb{Z}$ is isomorphic to the circle group $S$.


Proof: Define a map $\varphi : \mathbb{R} \to S$ by

$$\varphi(x) = e^{2\pi i x}.$$

Then $\varphi$ is a surjective homomorphism with kernel $\mathbb{Z}$, and the lemma follows.
□

Figure 6.3. The Cartesian product of two circles is a torus.


Next let $\mathbb{T}^n$ denote the direct product of $n$ copies of $S$, and call this
the $n$-dimensional torus. For instance, $\mathbb{T}^2 = S \times S$ is the usual torus (with
a group structure) as sketched in Figure 6.3.


Theorem 6.4. If $L$ is an $n$-dimensional lattice in $\mathbb{R}^n$ then $\mathbb{R}^n/L$ is
isomorphic to the $n$-dimensional torus $\mathbb{T}^n$.


Proof: Let $\{e_1, \ldots, e_n\}$ be generators for $L$. Then $\{e_1, \ldots, e_n\}$ is a basis
for $\mathbb{R}^n$. Define $\varphi : \mathbb{R}^n \to \mathbb{T}^n$ by

$$\varphi(\alpha_1 e_1 + \cdots + \alpha_n e_n) = (e^{2\pi i\alpha_1}, \ldots, e^{2\pi i\alpha_n}).$$

Then $\varphi$ is a surjective homomorphism, and the kernel of $\varphi$ is $L$. □


Lemma 6.5. The map $\varphi$ defined above, when restricted to the fundamental
domain $T$, yields a bijection $T \to \mathbb{T}^n$. □


Geometrically, $\mathbb{T}^n$ is obtained by ‘glueing’ (i.e. identifying) opposite
faces of the closure of the fundamental domain, as in Figure 6.4.


Figure 6.4. The quotient of Euclidean space by a lattice of the same dimension is
a torus, obtained by identifying opposite edges of a fundamental domain.


Figure 6.5. The quotient of Euclidean space by a lattice of smaller dimension is
a cylinder.


If the dimension of $L$ is less than $n$, we have a similar result:


Theorem 6.6. Let $L$ be an $m$-dimensional lattice in $\mathbb{R}^n$. Then $\mathbb{R}^n/L$ is
isomorphic to $\mathbb{T}^m \times \mathbb{R}^{n-m}$.


Proof: Let $V$ be the subspace spanned by $L$, and choose a complement
$W$ so that $\mathbb{R}^n = V \oplus W$. Then $L \subseteq V$, $V/L \cong \mathbb{T}^m$ by Theorem 6.4,
$W \cong \mathbb{R}^{n-m}$, and the result follows. □


For example, $\mathbb{R}^2/\mathbb{Z} \cong \mathbb{T}^1 \times \mathbb{R}$, which geometrically is a cylinder as in
Figure 6.5.

The volume $v(X)$ of a subset $X \subseteq \mathbb{R}^n$ is defined in the usual way: for
precision we take it to be the value of the multiple integral

$$\int_X dx_1 \cdots dx_n$$

where $(x_1, \ldots, x_n)$ are coordinates. Of course the volume exists only when
the integral does.

Let $L \subseteq \mathbb{R}^n$ be a lattice of dimension $n$, so that $\mathbb{R}^n/L \cong \mathbb{T}^n$. Let $T$ be
a fundamental domain of $L$. We have noted the existence of a bijection

$$\varphi : T \to \mathbb{T}^n.$$

For any subset $X$ of $\mathbb{T}^n$ we define the volume $v(X)$ by

$$v(X) = v(\varphi^{-1}(X))$$

which exists if and only if $\varphi^{-1}(X)$ has a volume in $\mathbb{R}^n$.
Let $\nu : \mathbb{R}^n \to \mathbb{T}^n$ be the natural homomorphism with kernel $L$. It
is intuitively clear that $\nu$ is ‘locally volume-preserving’, that is, for each
$x \in \mathbb{R}^n$ there exists a ball $B_\varepsilon[x]$ such that for all subsets $X \subseteq B_\varepsilon[x]$ for
which $v(X)$ exists we have

$$v(X) = v(\nu(X)).$$

It is also intuitively clear that if an injective map is locally volume-preserving
then it is volume-preserving. We prove a result which combines these two
intuitive ideas:


Theorem 6.7. If $X$ is a bounded subset of $\mathbb{R}^n$ and $v(X)$ exists, and if
$v(\nu(X)) \neq v(X)$, then $\nu|_X$ is not injective.


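Proof: Assume $\nu|_X$ is injective. Now $X$, being bounded, intersects only
a finite number of the sets $T + l$, for $T$ a fundamental domain and $l \in L$.
Put

\[ X_l = X \cap (T + l). \]

Then we have

\[ X = X_{l_1} \cup \ldots \cup X_{l_n}. \]

For each $l_i$ define

\[ Y_{l_i} = X_{l_i} - l_i, \]

so that $Y_{l_i} \subseteq T$. We claim that the $Y_{l_i}$ are disjoint. Since $\nu(x - l_i) = \nu(x)$
for all $x \in \mathbb{R}^n$ this follows from the assumed injectivity of $\nu$. Now

\[ v(X_{l_i}) = v(Y_{l_i}) \]

for all $i$. Also

\[ \nu(X_{l_i}) = \phi(Y_{l_i}) \]

where $\phi$ is the bijection $T \to T^n$. Now we compute:

\[
\begin{aligned}
v(\nu(X)) &= v\left(\nu\left(\bigcup X_{l_i}\right)\right) \\
&= v\left(\bigcup Y_{l_i}\right) \\
&= \sum v(Y_{l_i}) \quad \text{by disjointness} \\
&= \sum v(X_{l_i}) \\
&= v(X),
\end{aligned}
\]

which is a contradiction. □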


The idea of the proof can be summed up pictorially by Figure 6.6. 


Figure 6.6. Proof of Theorem 6.7: if a locally volume-preserving map does not 
preserve volume globally, then it cannot be injective. 


6.3 Exercises 


1. Let $L$ be a lattice in $\mathbb{R}^2$ with $L \subseteq \mathbb{Z}^2$. Prove that the volume of a
fundamental domain $T$ is equal to the number of points of $\mathbb{Z}^2$ lying
in $T$.

2. Generalize the previous exercise to $\mathbb{R}^n$ and link this to Lemma 9.3
by using Theorem 1.17.

3. Sketch the lattices in $\mathbb{R}^2$ generated by:

(a) $(0, 1)$ and $(1, 0)$.

(b) $(-1, 2)$ and $(2, 2)$.

(c) $(1, 1)$ and $(2, 3)$.

(d) $(-2, -7)$ and $(4, -3)$.

(e) $(1, 20)$ and $(1, -20)$.

(f) $(1, \pi)$ and $(\pi, 1)$.

4. Sketch fundamental domains for these lattices.

5. Hence show that the fundamental domain of a lattice is not uniquely
determined until we specify a set of generators.

6. Verify that nonetheless the volume of a fundamental domain of a
given lattice is independent of the set of generators chosen.

7. Find two different fundamental domains for the lattice in $\mathbb{R}^3$ generated
by $(0, 0, 1)$, $(0, 2, 0)$, $(1, 1, 1)$. Show by direct calculation that
they have the same volume. Can you prove this geometrically by
dissecting the fundamental domains into mutually congruent pieces?
7


Minkowski’s Theorem


The aim of this chapter is to prove a marvellous theorem, due to Minkowski 
in 1896. This asserts the existence within a suitable set X of a non-zero 
point of a lattice L, provided the volume of X is sufficiently large relative to 
that of a fundamental domain of L. The idea behind the proof is deceptive 
in its simplicity: it is that X cannot be squashed into a space whose volume 
is less than that of X, unless X is allowed to overlap itself. Minkowski 
discovered that this essentially trivial observation has many non-trivial 
and important consequences, and used it as a foundation for an extensive 
theory of the ‘geometry of numbers’. As immediate and accessible instances 
of its application we prove the two- and four-squares theorems of classical 
number theory. 


7.1 Minkowski’s Theorem 


A subset $X \subseteq \mathbb{R}^n$ is convex if whenever $x, y \in X$ then all points on the
straight line segment joining $x$ to $y$ also lie in $X$. In algebraic terms, $X$ is
convex if, whenever $x, y \in X$, the point

\[ \lambda x + (1 - \lambda) y \]

belongs to $X$ for all real $\lambda$, $0 \leq \lambda \leq 1$.

For example a circle, a square, an ellipse, or a triangle is convex in
$\mathbb{R}^2$, but an annulus or crescent is not (Figure 7.1). A subset $X \subseteq \mathbb{R}^n$ is
(centrally) symmetric if $x \in X$ implies $-x \in X$. Geometrically this means
that $X$ is invariant under reflection in the origin. Of the sets in Figure 7.1,


Figure 7.1. Convex and non-convex sets. The circular disc, square, ellipse, and 
triangle are convex; the annulus and crescent are not. The circle, square, ellipse, 
and annulus are centrally symmetric about *; the triangle and crescent are not. 


assuming the origin to be at the positions marked with an asterisk, the 
circle, square, ellipse, and annulus are symmetric, but the triangle and 
crescent are not. 

We may now state Minkowski’s theorem. 


Theorem 7.1. (Minkowski’s Theorem.) Let $L$ be an $n$-dimensional lattice
in $\mathbb{R}^n$ with fundamental domain $T$, and let $X$ be a bounded symmetric
convex subset of $\mathbb{R}^n$. If

\[ v(X) > 2^n v(T) \]

then $X$ contains a non-zero point of $L$.


Proof: Double the size of $L$ to obtain a lattice $2L$ with fundamental
domain $2T$ of volume $2^n v(T)$. Consider the torus

\[ T^n = \mathbb{R}^n/2L. \]

By definition,

\[ v(T^n) = v(2T) = 2^n v(T). \]

Now the natural map $\nu : \mathbb{R}^n \to T^n$ cannot preserve the volume of $X$, since
this is strictly larger than $v(T^n)$: since $\nu(X) \subseteq T^n$ we have

\[ v(\nu(X)) \leq v(T^n) = 2^n v(T) < v(X). \]

It follows by Theorem 6.7 that $\nu|_X$ is not injective. Hence there exist
$x_1 \neq x_2$, $x_1, x_2 \in X$, such that

\[ \nu(x_1) = \nu(x_2), \]

or equivalently

\[ x_1 - x_2 \in 2L. \tag{7.1} \]

But $x_2 \in X$, so $-x_2 \in X$ by symmetry; and now by convexity

\[ \tfrac{1}{2}(x_1) + \tfrac{1}{2}(-x_2) \in X, \]

that is,

\[ \tfrac{1}{2}(x_1 - x_2) \in X. \]

But by Equation (7.1),

\[ \tfrac{1}{2}(x_1 - x_2) \in L. \]

Hence

\[ 0 \neq \tfrac{1}{2}(x_1 - x_2) \in X \cap L, \]

as required. □


Figure 7.2. Proof of Minkowski’s theorem. Expand the original lattice (○) to
double the size (●) and form the quotient torus. By computing volumes, the
natural quotient map is not injective when restricted to the given convex set.
From points $x_1$ and $x_2$ with the same image we may construct a non-zero lattice
point $\tfrac{1}{2}(x_1 - x_2)$.
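
The statement of Theorem 7.1 can be checked by brute force in a small case. The sketch below is only an illustration, not part of the proof: the lattice basis $(2, 1)$, $(1, 3)$, the open box $|x| < 2.5$, $|y| < 2.1$ and the function name are all choices made here for the example. The box is bounded, symmetric and convex, and its area $21$ exceeds $2^2 v(T) = 20$.

```python
from itertools import product

def nonzero_lattice_points_in_box(basis, half_widths, search=10):
    """Return the non-zero points m*e1 + n*e2 with |x| < a and |y| < b.

    basis       -- ((x1, y1), (x2, y2)), a Z-basis e1, e2 of a lattice in R^2
    half_widths -- (a, b), so that X is the open box |x| < a, |y| < b
    search      -- brute-force range for the integer coefficients m, n
    """
    (x1, y1), (x2, y2) = basis
    a, b = half_widths
    found = []
    for m, n in product(range(-search, search + 1), repeat=2):
        if (m, n) == (0, 0):
            continue  # the theorem is about non-zero lattice points
        x, y = m * x1 + n * x2, m * y1 + n * y2
        if abs(x) < a and abs(y) < b:
            found.append((x, y))
    return found

basis = ((2, 1), (1, 3))            # illustrative lattice, v(T) = |2*3 - 1*1| = 5
det = abs(2 * 3 - 1 * 1)
half_widths = (2.5, 2.1)            # v(X) = 4 * 2.5 * 2.1 = 21 > 4 * v(T) = 20
points = nonzero_lattice_points_in_box(basis, half_widths)
```

The points found come in pairs $\pm x$, as the symmetry of $X$ and $L$ requires.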


The geometrical reasoning is illustrated in Figure 7.2. The decisive step 
in the proof is that since T” has smaller volume than X it is impossible to 
squash X into T” without overlap: the ancient platitude of quarts and pint 
pots. That such olde-worlde wisdom becomes, in the hands of Minkowski, a 
weapon of devastating power, was the wonder of the 19th century and a les- 
son for the 20th. We will unleash this power at several crucial stages in the 
forthcoming battle. (Note that our original Thespian metaphor has been 
abandoned in favour of a military one, reinforcing the change of viewpoint 
from that of the algebraic voyeur to that of the geometric participant.) As 
a more immediate affirmation, we now give two traditional applications to 
number theory: the ‘two-squares’ and ‘four-squares’ theorems. 


7.2 The Two-Squares Theorem 
We start by proving: 


Theorem 7.2. If p is prime of the form 4k +1 then p is a sum of two 
integer squares. 


Proof: The multiplicative group $G$ of the field $\mathbb{Z}_p$ is cyclic (Garling [28]
Corollary 1 to Theorem 12.3, p. 105; Stewart [71], p. 171) and has order
$p - 1 = 4k$. It therefore contains an element $u$ of order 4. Then $u^2 \equiv -1$
(mod $p$) since $-1$ is the only element of order 2 in $G$.

Let $L \subseteq \mathbb{Z}^2$ be the lattice in $\mathbb{R}^2$ consisting of all pairs $(a, b)$ $(a, b \in \mathbb{Z})$
such that

\[ b \equiv ua \pmod{p}. \]

This is a subgroup of $\mathbb{Z}^2$ of index $p$ (an easy verification left to the
reader) so the volume of a fundamental domain for $L$ is $p$. By Minkowski’s
theorem any circle, centre the origin, of radius $r$, which has area

\[ \pi r^2 > 4p \]

contains a non-zero point of $L$. This is the case for $r^2 = 3p/2$. So there
exists a point $(a, b) \in L$, not the origin, for which

\[ 0 \neq a^2 + b^2 \leq r^2 = 3p/2 < 2p. \]

But modulo $p$ we have

\[ a^2 + b^2 \equiv a^2 + u^2 a^2 \equiv 0. \]

Hence $a^2 + b^2$, being a multiple of $p$ strictly between 0 and $2p$, must equal $p$.
□


The reader should draw the lattice L and the relevant circle in a few 
cases (p = 5,13,17) and check that the relevant lattice point exists and 
provides suitable a, b. 
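
For readers who prefer to check cases by machine, the search suggested above can be sketched as follows. This is a minimal brute-force illustration of the lattice argument, assuming $p$ is a prime of the form $4k + 1$; the function name `two_squares` is introduced here and is not part of the text.

```python
from math import isqrt

def two_squares(p):
    """Brute-force sketch of the lattice argument for a prime p = 4k + 1.

    Returns (u, a, b) with u^2 = -1 (mod p), b = u*a (mod p) and
    0 < a^2 + b^2 <= 3p/2, so that a^2 + b^2 = p.
    """
    # An element u of order 4 in the multiplicative group mod p.
    u = next(x for x in range(2, p) if (x * x + 1) % p == 0)
    bound = isqrt(3 * p // 2)          # |a|, |b| <= r, where r^2 = 3p/2
    # By the symmetry of L it is enough to search a >= 0.
    for a in range(0, bound + 1):
        for b in range(-bound, bound + 1):
            in_lattice = (b - u * a) % p == 0
            in_disc = 0 < 2 * (a * a + b * b) <= 3 * p
            if in_lattice and in_disc:
                return u, a, b
    raise ValueError("no lattice point found; is p a prime of the form 4k + 1?")

results = {p: two_squares(p) for p in (5, 13, 17)}
```

For $p = 5$ the search takes $u = 2$ and finds $(a, b) = (1, 2)$, giving $5 = 1^2 + 2^2$.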

Theorem 7.2 goes back to Fermat, who stated it in a letter to Mersenne 
in 1640. He sent a sketch proof to Pierre de Carcavi in 1659. Euler gave a 
complete proof in 1754. 


7.3 The Four-Squares Theorem 
Refining this argument leads to another famous theorem: 
Theorem 7.3. Every positive integer is a sum of four integer squares. 


Proof: We prove the theorem for primes $p$, and then extend the result to
all integers. Now

\[ 2 = 1^2 + 1^2 + 0^2 + 0^2 \]

so we may suppose $p$ is odd. We claim that the congruence

\[ u^2 + v^2 + 1 \equiv 0 \pmod{p} \]

has a solution $u, v \in \mathbb{Z}$. This is because $u^2$ takes exactly $(p + 1)/2$ distinct
values as $u$ runs through $0, \ldots, p - 1$; and $-1 - v^2$ also takes on $(p + 1)/2$
values: for the congruence to have no solution all these values, $p + 1$ in
total, will be distinct: then we have $p + 1 \leq p$ which is absurd.

For such a choice of $u$, $v$ consider the lattice $L \subseteq \mathbb{Z}^4$ consisting of
$(a, b, c, d)$ such that

\[ c \equiv ua + vb, \quad d \equiv ub - va \pmod{p}. \]

Then $L$ has index $p^2$ in $\mathbb{Z}^4$ (another easy computation) so the volume of a
fundamental domain is $p^2$. Now a 4-dimensional sphere, centre the origin,
radius $r$, has volume

\[ \pi^2 r^4 / 2 \]

and we choose $r$ to make this greater than $2^4 p^2 = 16p^2$; say $r^2 = 1.9p$.
Then there exists a lattice point $0 \neq (a, b, c, d)$ in this 4-sphere, and so

\[ 0 \neq a^2 + b^2 + c^2 + d^2 \leq r^2 = 1.9p < 2p. \]

Modulo $p$, it is easy to verify that $a^2 + b^2 + c^2 + d^2 \equiv 0$, hence as before
$a^2 + b^2 + c^2 + d^2$ must equal $p$.

To deal with an arbitrary integer $n$, it suffices to factorize $n$ into primes
and then use the identity

\[
\begin{aligned}
(a^2 + b^2 + c^2 + d^2)(A^2 + B^2 + C^2 + D^2)
&= (aA - bB - cC - dD)^2 + (aB + bA + cD - dC)^2 \\
&\quad + (aC - bD + cA + dB)^2 + (aD + bC - cB + dA)^2.
\end{aligned}
\]

□
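
The two halves of the proof (a lattice point for each prime, then the product identity) can be combined into a short computation. The following is a brute-force sketch under the choices made in the proof ($r^2 = 1.9p$); the function names are introduced here for illustration, and the search is meant to mirror the argument rather than to be efficient.

```python
from math import isqrt
from itertools import product

def four_squares_prime(p):
    """Four integers whose squares sum to the prime p, via the lattice L."""
    if p == 2:
        return (1, 1, 0, 0)
    # Solve u^2 + v^2 + 1 = 0 (mod p); the counting argument guarantees a solution.
    u, v = next((u, v) for u in range(p) for v in range(p)
                if (u * u + v * v + 1) % p == 0)
    bound = isqrt(19 * p // 10)        # each |coordinate| <= r, where r^2 = 1.9p
    for a, b in product(range(-bound, bound + 1), repeat=2):
        # c, d are fixed modulo p by a, b; only representatives with |c|, |d| < p matter.
        c0, d0 = (u * a + v * b) % p, (u * b - v * a) % p
        for c, d in product((c0, c0 - p), (d0, d0 - p)):
            total = a * a + b * b + c * c + d * d
            if 0 < 10 * total <= 19 * p:   # 0 < total <= 1.9p forces total = p
                return (a, b, c, d)
    raise ValueError("no lattice point found; is p prime?")

def combine(x, y):
    """Apply the four-squares product identity to two quadruples."""
    a, b, c, d = x
    A, B, C, D = y
    return (a * A - b * B - c * C - d * D,
            a * B + b * A + c * D - d * C,
            a * C - b * D + c * A + d * B,
            a * D + b * C - c * B + d * A)

def four_squares(n):
    """Four integers whose squares sum to the positive integer n."""
    result = (1, 0, 0, 0)              # represents 1
    p = 2
    while n > 1:
        while n % p == 0:              # p is prime whenever this succeeds
            result = combine(result, four_squares_prime(p))
            n //= p
        p += 1
    return result

fifteen = combine((1, 1, 1, 0), (2, 1, 0, 0))   # 3 = 1+1+1+0, 5 = 4+1+0+0
```

Here `fifteen` is $(1, 3, 2, -1)$, and indeed $1 + 9 + 4 + 1 = 15$.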


Theorem 7.3 also goes back to Fermat. Euler spent 40 years trying to 
prove it, and Lagrange succeeded in 1770. 


7.4 Exercises 


1. Which of the following solids are convex? Sphere, pyramid, icosahedron,
cube, torus, ellipsoid, parallelepiped.

2. How many different convex solids can be made by joining $n$ unit cubes
face to face, so that their vertices coincide, for $n = 1, 2, 3, 4, 5, 6$;
counting two solids as different if and only if they cannot be mapped
to each other by rigid motions? What is the result for general $n$?

3. Verify the two-squares theorem on all primes less than 200.

4. Verify the four-squares theorem on all integers less than 100.

5. Prove that not every integer is a sum of three squares.

6. Prove that the number $\mu(n)$ of pairs of integers $(x, y)$ with $x^2 + y^2 < n$
satisfies $\mu(n)/n \to \pi$ as $n \to \infty$.
8


Geometric Representation of
Algebraic Numbers


The purpose of this chapter is to develop a method of embedding a number 
field K in a real vector space of dimension equal to the degree of K, in such 
a way that ideals in K map to lattices in this vector space. This opens the 
way to applications of Minkowski’s theorem. The embedding is defined in 
terms of the monomorphisms K — C, and we have to distinguish between 
those which map K into R and those which do not. 


8.1 The Space $L^{st}$


Let $K = \mathbb{Q}(\theta)$ be a number field of degree $n$, where $\theta$ is an algebraic integer.
Let $\sigma_1, \ldots, \sigma_n$ be the set of all monomorphisms $K \to \mathbb{C}$ (see Theorem 2.4).
If $\sigma_i(K) \subseteq \mathbb{R}$, which happens if and only if $\sigma_i(\theta) \in \mathbb{R}$, we say that $\sigma_i$ is
real, otherwise $\sigma_i$ is complex. As usual denote complex conjugation by bars
and define

\[ \bar{\sigma}_i(\alpha) = \overline{\sigma_i(\alpha)}. \]

Since complex conjugation is an automorphism of $\mathbb{C}$ it follows that $\bar{\sigma}_i$ is a
monomorphism $K \to \mathbb{C}$, so equals $\sigma_j$ for some $j$. Now $\sigma_i = \bar{\sigma}_i$ if and only if
$\sigma_i$ is real, and $\bar{\bar{\sigma}}_i = \sigma_i$, so the complex monomorphisms come in conjugate
pairs. Hence

\[ n = s + 2t \]

where $s$ is the number of real monomorphisms and $2t$ the number of complex.
We standardize the numeration in such a way that the system of all
monomorphisms $K \to \mathbb{C}$ is

\[ \sigma_1, \ldots, \sigma_s;\ \sigma_{s+1}, \bar{\sigma}_{s+1}, \ldots, \sigma_{s+t}, \bar{\sigma}_{s+t}, \]

where $\sigma_1, \ldots, \sigma_s$ are real and the rest complex.


Further define

\[ L^{st} = \mathbb{R}^s \times \mathbb{C}^t, \]

the set of all $(s + t)$-tuples

\[ x = (x_1, \ldots, x_s;\ x_{s+1}, \ldots, x_{s+t}) \]

where $x_1, \ldots, x_s \in \mathbb{R}$ and $x_{s+1}, \ldots, x_{s+t} \in \mathbb{C}$. Then $L^{st}$ is a vector space
over $\mathbb{R}$, and a ring (with coordinatewise operations): in fact it is an $\mathbb{R}$-algebra.
As vector space over $\mathbb{R}$ it has dimension $s + 2t = n$.

For $x \in L^{st}$ we define the norm

\[ N(x) = x_1 \cdots x_s\, |x_{s+1}|^2 \cdots |x_{s+t}|^2. \tag{8.1} \]

(There is no confusion with other uses of the word ‘norm’, and we shall see
why it is desirable to use this apparently overworked word in a moment.)
The norm has two obvious properties:

(a) $N(x)$ is real for all $x$,

(b) $N(xy) = N(x)N(y)$.

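We define a map

\[ \sigma : K \to L^{st} \]

by

\[ \sigma(\alpha) = (\sigma_1(\alpha), \ldots, \sigma_s(\alpha);\ \sigma_{s+1}(\alpha), \ldots, \sigma_{s+t}(\alpha)) \]

for $\alpha \in K$. Clearly

\[
\begin{aligned}
\sigma(\alpha + \beta) &= \sigma(\alpha) + \sigma(\beta) \\
\sigma(\alpha\beta) &= \sigma(\alpha)\sigma(\beta)
\end{aligned}
\tag{8.2}
\]

for all $\alpha, \beta \in K$; so $\sigma$ is a ring homomorphism. If $r$ is a rational number
then

\[ \sigma(r\alpha) = r\sigma(\alpha) \]

so $\sigma$ is a $\mathbb{Q}$-algebra homomorphism. Furthermore, we have

\[ N(\sigma(\alpha)) = N(\alpha) \tag{8.3} \]

since the latter is defined to be

\[ \sigma_1(\alpha) \cdots \sigma_s(\alpha)\, \sigma_{s+1}(\alpha) \bar{\sigma}_{s+1}(\alpha) \cdots \sigma_{s+t}(\alpha) \bar{\sigma}_{s+t}(\alpha) \]

which equals the former.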
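For example, let $K = \mathbb{Q}(\theta)$ where $\theta \in \mathbb{R}$ satisfies

\[ \theta^3 - 2 = 0. \]

Then the conjugates of $\theta$ are $\theta, \omega\theta, \omega^2\theta$ where $\omega$ is a complex cube root of
unity. The monomorphisms $K \to \mathbb{C}$ are given by

\[ \sigma_1(\theta) = \theta, \quad \sigma_2(\theta) = \omega\theta, \quad \bar{\sigma}_2(\theta) = \omega^2\theta. \]

Hence $s = t = 1$.
An element of $K$, say

\[ x = q + r\theta + s\theta^2 \]

where $q, r, s \in \mathbb{Q}$, maps into $L^{1,1}$ according to

\[ \sigma(x) = (q + r\theta + s\theta^2,\ q + r\omega\theta + s\omega^2\theta^2). \]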
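
The example can be explored numerically. The sketch below computes $\sigma(x)$ in floating point and compares the norm (8.1) of the image with the field norm, as (8.3) predicts. The closed form $q^3 + 2r^3 + 4s^3 - 6qrs$ for the field norm is a standard formula for $\theta^3 = 2$ that is supplied here, not taken from the text, and all function names are illustrative.

```python
import cmath

THETA = 2 ** (1 / 3)                       # the real cube root of 2
OMEGA = cmath.exp(2j * cmath.pi / 3)       # a complex cube root of unity

def sigma(q, r, s):
    """Image of x = q + r*theta + s*theta^2 in L^{1,1} = R x C."""
    real_coord = q + r * THETA + s * THETA ** 2
    complex_coord = q + r * OMEGA * THETA + s * OMEGA ** 2 * THETA ** 2
    return real_coord, complex_coord

def norm_L(point):
    """N(x) = x_1 * |x_2|^2 for x = (x_1, x_2) in R x C, as in (8.1)."""
    x1, x2 = point
    return x1 * abs(x2) ** 2

def field_norm(q, r, s):
    """Closed form for N(q + r*theta + s*theta^2) when theta^3 = 2 (assumed formula)."""
    return q ** 3 + 2 * r ** 3 + 4 * s ** 3 - 6 * q * r * s

example = norm_L(sigma(1, 1, 0))           # N(1 + theta) = 1 + theta^3 = 3
```

Up to rounding error, `norm_L(sigma(q, r, s))` agrees with `field_norm(q, r, s)` for rational $q, r, s$.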


The kernel of $\sigma$ is an ideal of $K$ since $\sigma$ is a ring homomorphism. Since
$K$ is a field this means that either $\sigma$ is identically zero or $\sigma$ is injective.
But

\[ \sigma(1) = (1, 1, \ldots, 1) \neq 0 \]

so $\sigma$ must be injective. Much stronger is the following result:


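Theorem 8.1. If $\alpha_1, \ldots, \alpha_n$ is a basis for $K$ over $\mathbb{Q}$ then $\sigma(\alpha_1), \ldots, \sigma(\alpha_n)$
are linearly independent over $\mathbb{R}$.


Proof: Linear independence over $\mathbb{Q}$ is immediate since $\sigma$ is injective, but
we need more than this. Let

\[
\begin{aligned}
\sigma_k(\alpha_l) &= x_k^{(l)} & (k = 1, \ldots, s) \\
\sigma_{s+j}(\alpha_l) &= y_j^{(l)} + i z_j^{(l)} & (j = 1, \ldots, t)
\end{aligned}
\]

where $x_k^{(l)}, y_j^{(l)}, z_j^{(l)}$ are real. Then

\[ \sigma(\alpha_l) = (x_1^{(l)}, \ldots, x_s^{(l)};\ y_1^{(l)} + i z_1^{(l)}, \ldots, y_t^{(l)} + i z_t^{(l)}), \]

and it is sufficient to prove that the determinant

\[
D = \begin{vmatrix}
x_1^{(1)} & \cdots & x_s^{(1)} & y_1^{(1)} & z_1^{(1)} & \cdots & y_t^{(1)} & z_t^{(1)} \\
\vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots \\
x_1^{(n)} & \cdots & x_s^{(n)} & y_1^{(n)} & z_1^{(n)} & \cdots & y_t^{(n)} & z_t^{(n)}
\end{vmatrix}
\]

is non-zero. Put

\[
E = \begin{vmatrix}
x_1^{(1)} & \cdots & x_s^{(1)} & y_1^{(1)} + i z_1^{(1)} & y_1^{(1)} - i z_1^{(1)} & \cdots \\
\vdots & & \vdots & \vdots & \vdots & \\
x_1^{(n)} & \cdots & x_s^{(n)} & y_1^{(n)} + i z_1^{(n)} & y_1^{(n)} - i z_1^{(n)} & \cdots
\end{vmatrix}
= \begin{vmatrix}
\sigma_1(\alpha_1) & \cdots & \sigma_s(\alpha_1) & \sigma_{s+1}(\alpha_1) & \bar{\sigma}_{s+1}(\alpha_1) & \cdots \\
\vdots & & \vdots & \vdots & \vdots & \\
\sigma_1(\alpha_n) & \cdots & \sigma_s(\alpha_n) & \sigma_{s+1}(\alpha_n) & \bar{\sigma}_{s+1}(\alpha_n) & \cdots
\end{vmatrix}.
\]

Then

\[ E^2 = \Delta[\alpha_1, \ldots, \alpha_n] \]

by definition of the discriminant; and by Theorem 2.7, $E^2 \neq 0$. Now
elementary properties of determinants (column operations) yield

\[ E = (-2i)^t D \]

so that $D \neq 0$ as required. □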


Corollary 8.2. $\mathbb{Q}$-linearly independent elements of $K$ map under $\sigma$ to
$\mathbb{R}$-linearly independent elements of $L^{st}$. □


Corollary 8.3. If $G$ is a finitely generated subgroup of $(K, +)$ with
$\mathbb{Z}$-basis $\{\alpha_1, \ldots, \alpha_m\}$ then the image of $G$ in $L^{st}$ is a lattice with generators
$\sigma(\alpha_1), \ldots, \sigma(\alpha_m)$. □


The ‘geometric representation’ of $K$ in $L^{st}$ defined by $\sigma$, in combination
with Minkowski’s theorem, will provide the key to several of the deeper
parts of our theory, in Chapters 9, 10, and Appendix B. For these applications
we shall need a notion of ‘distance’ on $L^{st}$. Since $L^{st}$, as a real
vector space, is isomorphic to $\mathbb{R}^{s+2t}$ the natural thing is to transfer the
usual Euclidean metric from $\mathbb{R}^{s+2t}$ to $L^{st}$. This amounts to choosing a
basis in $L^{st}$ and defining an inner product with respect to which this basis
is orthonormal. The natural basis to pick is

\[
\begin{array}{l}
(1, 0, \ldots, 0;\ 0, \ldots, 0) \\
(0, 1, \ldots, 0;\ 0, \ldots, 0) \\
\quad \vdots \\
(0, 0, \ldots, 1;\ 0, \ldots, 0) \\
(0, 0, \ldots, 0;\ 1, 0, \ldots, 0) \\
(0, 0, \ldots, 0;\ i, 0, \ldots, 0) \\
\quad \vdots \\
(0, 0, \ldots, 0;\ 0, 0, \ldots, 1) \\
(0, 0, \ldots, 0;\ 0, 0, \ldots, i).
\end{array}
\tag{8.4}
\]
With respect to this basis the element

\[ (x_1, \ldots, x_s;\ y_1 + i z_1, \ldots, y_t + i z_t) \]

of $L^{st}$ has coordinates

\[ (x_1, \ldots, x_s, y_1, z_1, \ldots, y_t, z_t). \]

Changing notation slightly, if we take

\[
\begin{aligned}
x &= (x_1, \ldots, x_{s+2t}) \\
x' &= (x'_1, \ldots, x'_{s+2t})
\end{aligned}
\]

with respect to the new coordinates (8.4), then the inner product is defined
by

\[ (x, x') = x_1 x'_1 + \ldots + x_{s+2t} x'_{s+2t}. \]

The length of a vector $x$ is then

\[ \|x\| = \sqrt{(x, x)} \]

and the distance between $x$ and $x'$ is $\|x - x'\|$.
Referred to our original mixture of real and complex coordinates we
have, for

\[ x = (x_1, \ldots, x_s;\ y_1 + i z_1, \ldots, y_t + i z_t), \]

\[ \|x\| = \sqrt{x_1^2 + \ldots + x_s^2 + y_1^2 + z_1^2 + \ldots + y_t^2 + z_t^2}. \]


8.2 Exercises 


1. Find the monomorphisms $\sigma_i : K \to \mathbb{C}$ for the following fields and
determine the number $s$ of the $\sigma_i$ satisfying $\sigma_i(K) \subseteq \mathbb{R}$, and the
number $t$ of distinct conjugate pairs $\sigma_i, \sigma_j$ such that $\bar{\sigma}_i = \sigma_j$:

(i) $\mathbb{Q}(\sqrt{5})$

(ii) $\mathbb{Q}(\sqrt{-5})$

(iii) $\mathbb{Q}(\sqrt[4]{5})$

(iv) $\mathbb{Q}(\zeta)$ where $\zeta = e^{2\pi i/7}$

(v) $\mathbb{Q}(\zeta)$ where $\zeta = e^{2\pi i/p}$ for a rational prime $p$.

2. For $K = \mathbb{Q}(\sqrt{d})$ where $d$ is a squarefree integer, calculate $\sigma : K \to L^{st}$,
distinguishing the cases $d < 0$, $d > 0$. Compute $N(x)$ for $x \in L^{st}$
and by direct calculation verify

\[ N(\alpha) = N(\sigma(\alpha)) \quad (\alpha \in K). \]

3. Let $K = \mathbb{Q}(\theta)$ where the algebraic integer $\theta$ has minimum polynomial
$f$. If $f$ factorizes over $\mathbb{R}$ into irreducibles as

\[ f(t) = g_1(t) \cdots g_q(t)\, h_1(t) \cdots h_r(t) \]

where $g_j$ is linear and $h_j$ quadratic, prove that $q = s$ and $r = t$ in the
notation of the chapter for $s$, $t$.

4. Let $K = \mathbb{Q}(\theta)$ where $\theta \in \mathbb{R}$ and $\theta^3 = 3$. What is the map $\sigma$ in this
case? Pick a basis for $K$ and verify Theorem 8.1 for it.

5. Find a map from $\mathbb{R}^2$ to itself under which $\mathbb{Q}$-linearly independent sets
map to $\mathbb{Q}$-linearly independent sets, but some $\mathbb{R}$-linearly independent
set does not map to an $\mathbb{R}$-linearly independent set.

6. If $K = \mathbb{Q}(\theta)$ where $\theta \in \mathbb{R}$ and $\theta^3 = 3$, verify Corollary 8.3 for the
additive subgroup of $K$ generated by $1 + \theta$ and $\theta^2 - 2$.


9 


Class-Group and Class-Number


We now use the geometric ideas that we have developed to build further 
insight into the property of unique factorization. We already know that 
the factorization of elements into primes is unique if and only if every ideal 
in the ring of integers is principal. The plan now is to refine this statement 
to give a quantitative measure of how non-unique factorization can be. 
We do this by using the fractional ideals introduced in Chapter 5. The 
class-group of a number field is defined to be the quotient of the group of 
fractional ideals by the (normal) subgroup of principal fractional ideals; the 
class-number is the order of this group. This gives the required measure: 
factorization in a ring of integers is unique if and only if the corresponding 
class-number is 1. If the class-number is greater than 1, factorization is 
non-unique. Intuitively, the larger the class-number, the more complicated 
the possibilities for non-uniqueness become. 

It turns out that the class-number is always finite, an important fact 
which we prove using Minkowski’s theorem. Simple group-theoretic con- 
siderations then yield useful conditions for an ideal to be principal. These 
conditions lead to a proof that every ideal becomes principal in a suitable 
extension field, which is one formulation of the basic idea of Kummer’s 
‘ideal numbers’ within the ideal-theoretic framework. 


The importance of the class-number can only be hinted at here. It is 
crucial in the proof of Kummer’s special case of Fermat’s Last Theorem 
in Chapter 11. Many deep and delicate results in the theory of numbers 
are related to arithmetic properties of the class-number, or to algebraic 
properties of the class-group. 
9.1 The Class-Group


As usual let $\mathfrak{O}$ be the ring of integers of a number field $K$ of degree $n$.
Theorem 5.21 tells us that prime factorization in $\mathfrak{O}$ is unique if and only if
every ideal of $\mathfrak{O}$ is principal. Our aim here is to find a way of measuring
how far prime factorization fails to be unique in the case where $\mathfrak{O}$ contains
non-principal ideals, or equivalently how far away the ideals of $\mathfrak{O}$ are from
being principal.

To this end we use the group of fractional ideals defined in Chapter 5.
Say that a fractional ideal of $\mathfrak{O}$ is principal if it is of the form $c^{-1}\mathfrak{a}$ where
$\mathfrak{a}$ is a principal ideal of $\mathfrak{O}$. Let $\mathcal{F}$ be the group of fractional ideals under
multiplication. It is easy to check that the set $\mathcal{P}$ of principal fractional
ideals is a subgroup of $\mathcal{F}$. We define the class-group of $\mathfrak{O}$ to be the quotient
group

\[ \mathcal{H} = \mathcal{F}/\mathcal{P}. \]

The class-number $h = h(\mathfrak{O})$ is defined to be the order of $\mathcal{H}$.

Since each of $\mathcal{F}$, $\mathcal{P}$ is an infinite group we have no immediate way of
deciding whether or not $h$ is finite. In fact it is; and our subsequent efforts
will be devoted to a proof of this deep and important fact. First,
however, we shall reformulate the definition of the class-group in a manner
independent of fractional ideals.

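Let us say that two fractional ideals are equivalent if they belong to the
same coset of $\mathcal{P}$ in $\mathcal{F}$, or in other words if they map to the same element
of $\mathcal{F}/\mathcal{P}$. If $\mathfrak{a}$ and $\mathfrak{b}$ are fractional ideals we write

\[ \mathfrak{a} \sim \mathfrak{b} \]

if $\mathfrak{a}$ and $\mathfrak{b}$ are equivalent, and use

\[ [\mathfrak{a}] \]

to denote the equivalence class of $\mathfrak{a}$.
The class-group $\mathcal{H}$ is the set of these equivalence classes.
If $\mathfrak{a}$ is a fractional ideal then $\mathfrak{a} = c^{-1}\mathfrak{b}$ where $c \in \mathfrak{O}$ and $\mathfrak{b}$ is an ideal.
Hence

\[ \mathfrak{b} = c\mathfrak{a} = (c)\mathfrak{a} \]

and since $(c) \in \mathcal{P}$ this means that $\mathfrak{a} \sim \mathfrak{b}$. In other words, every equivalence
class contains an ideal.

Now let $\mathfrak{x}$ and $\mathfrak{y}$ be equivalent ideals. Then $\mathfrak{x} = \mathfrak{c}\mathfrak{y}$ where $\mathfrak{c}$ is a principal
fractional ideal, say $\mathfrak{c} = d^{-1}\mathfrak{e}$ for $d \in \mathfrak{O}$, $\mathfrak{e}$ a principal ideal. Therefore

\[ \mathfrak{x}(d) = \mathfrak{y}\mathfrak{e}. \]

Conversely if $\mathfrak{x}\mathfrak{b} = \mathfrak{y}\mathfrak{e}$ for $\mathfrak{b}$, $\mathfrak{e}$ principal ideals, then $\mathfrak{x} \sim \mathfrak{y}$.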
This allows us to describe $\mathcal{H}$ as follows: take the set $F$ of all ideals,
and define upon it a relation $\sim$ by $\mathfrak{x} \sim \mathfrak{y}$ if and only if there exist principal
ideals $\mathfrak{b}$, $\mathfrak{e}$ with $\mathfrak{x}\mathfrak{b} = \mathfrak{y}\mathfrak{e}$. Then $\mathcal{H}$ is the set of equivalence classes $[\mathfrak{x}]$, with
group operation defined by

\[ [\mathfrak{x}][\mathfrak{y}] = [\mathfrak{x}\mathfrak{y}]. \]

It is for this reason that $\mathcal{H}$ is called the class-group.
The significance of the class-group is that it captures the extent to which
factorization is not unique. In particular we have


Theorem 9.1. Factorization in $\mathfrak{O}$ is unique if and only if the class-group
$\mathcal{H}$ has order 1, or equivalently the class-number $h = 1$.


Proof: Factorization is unique if and only if every ideal of $\mathfrak{O}$ is principal
(Theorem 5.21), which in turn is true if and only if every fractional ideal is
principal, which is equivalent to $\mathcal{F} = \mathcal{P}$, which is equivalent to $|\mathcal{H}| = h = 1$.
□


The rest of this chapter proves that $h$ is finite, and deduces a few useful
consequences. In the next chapter we develop some methods whereby $h$,
and the structure of $\mathcal{H}$, may be computed: such methods are an obvious
necessity for applications of the class-group in particular cases.


9.2 An Existence Theorem 

The finiteness of $h$ rests on an application of Minkowski’s theorem to the
space $L^{st}$. It is, in fact, possible to give a more elementary proof that $h$ is
finite, see Lang [43], for example, but Minkowski’s theorem gives a better
bound, and is in any case needed elsewhere. In this section we state and
prove the relevant result, leaving the finiteness theorem to the next section.


Lemma 9.2. If $M$ is a lattice in $L^{st}$ of dimension $s + 2t$ having fundamental
domain of volume $V$, and if $c_1, \ldots, c_{s+t}$ are positive real numbers whose
product

\[ c_1 \cdots c_{s+t} > \left(\frac{4}{\pi}\right)^t V \]

then there exists in $M$ a non-zero element

\[ x = (x_1, \ldots, x_{s+t}) \]

such that

\[
\begin{aligned}
&|x_1| < c_1, \ldots, |x_s| < c_s; \\
&|x_{s+1}|^2 < c_{s+1}, \ldots, |x_{s+t}|^2 < c_{s+t}.
\end{aligned}
\]


Proof: Let $X$ be the set of all points $x \in L^{st}$ for which the conclusion
holds. We compute

\[
\begin{aligned}
v(X) &= \int_{-c_1}^{c_1} dx_1 \cdots \int_{-c_s}^{c_s} dx_s \times \iint_{y_1^2 + z_1^2 < c_{s+1}} dy_1\,dz_1 \times \cdots \times \iint_{y_t^2 + z_t^2 < c_{s+t}} dy_t\,dz_t \\
&= 2c_1 \cdot 2c_2 \cdots 2c_s \cdot \pi c_{s+1} \cdots \pi c_{s+t} \\
&= 2^s \pi^t c_1 \cdots c_{s+t}.
\end{aligned}
\]

Now $X$ is a cartesian product of line segments and circular discs, so $X$ is
bounded, symmetric, and convex. Minkowski’s theorem yields the required
result provided

\[ 2^s \pi^t c_1 \cdots c_{s+t} > 2^{s+2t} V, \]

that is

\[ c_1 \cdots c_{s+t} > \left(\frac{4}{\pi}\right)^t V. \]

□


Let $K$ be a number field of degree $n = s + 2t$ as usual, with ring of
integers $\mathfrak{O}$; and let $\mathfrak{a}$ be an ideal of $\mathfrak{O}$. Then $(\mathfrak{a}, +)$ is a free abelian group
of rank $n$ (Theorem 2.16) so by Corollary 8.3 its image $\sigma(\mathfrak{a})$ in $L^{st}$ is a
lattice of dimension $n$. To use the above lemma in this situation we must
know the volume of a fundamental domain for $\sigma(\mathfrak{a})$. First we note:

Lemma 9.3. Let $L$ be an $n$-dimensional lattice in $\mathbb{R}^n$ with basis $\{e_1, \ldots, e_n\}$.
Suppose

\[ e_i = (a_{1i}, \ldots, a_{ni}). \]

Then the volume of the fundamental domain $T$ of $L$ defined by this basis is

\[ v(T) = |\det a_{ij}|. \]


Proof:

\[ v(T) = \int_T dx_1 \cdots dx_n. \]

Define new variables by

\[ x_i = \sum_j a_{ij} y_j. \]

The Jacobian of this transformation is equal to $\det a_{ij}$, and $T$ is the set of
points $\sum_j a_{ij} y_j$ with $0 \leq y_j < 1$. By the transformation formula for multiple
integrals (Apostol [2] p. 271) we have

\[
\begin{aligned}
v(T) &= \int_T |\det a_{ij}|\, dy_1 \cdots dy_n \\
&= |\det a_{ij}| \int_0^1 dy_1 \cdots \int_0^1 dy_n \\
&= |\det a_{ij}|.
\end{aligned}
\]

□


Given a lattice L there exist many different Z-bases for L, and hence 
many distinct fundamental domains. However, since distinct Z-bases are 
related by a unimodular matrix, it follows from Lemma 9.3 that the volumes 
of these distinct fundamental domains are all equal. 
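
A small computation illustrates both Lemma 9.3 and the remark on unimodular changes of basis. The lattice and the unimodular matrix below are arbitrary choices made for this sketch, and the helper names are not part of the text; the determinant is computed exactly with integer arithmetic.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def fundamental_volume(a):
    """|det a_ij|, where column i of the matrix holds the basis vector e_i."""
    return abs(det(a))

# Illustrative lattice in R^3 with basis (1,0,0), (1,2,0), (0,1,3) as columns.
A = [[1, 1, 0],
     [0, 2, 1],
     [0, 0, 3]]
# An integer matrix of determinant 1: A*U holds another Z-basis of the same lattice.
U = [[1, 2, 0],
     [0, 1, 0],
     [1, 0, 1]]

volume_old = fundamental_volume(A)
volume_new = fundamental_volume(matmul(A, U))
```

Both volumes equal 6, since $\det(AU) = \det A \cdot \det U$ and $\det U = \pm 1$ for a unimodular matrix.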


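Theorem 9.4. Let $K$ be a number field of degree $n = s + 2t$ as usual, with
ring of integers $\mathfrak{O}$, and let $0 \neq \mathfrak{a}$ be an ideal of $\mathfrak{O}$. Then the volume of a
fundamental domain for $\sigma(\mathfrak{a})$ in $L^{st}$ is equal to

\[ 2^{-t} N(\mathfrak{a}) \sqrt{|\Delta|} \]

where $\Delta$ is the discriminant of $K$.


Proof: Let $\{\alpha_1, \ldots, \alpha_n\}$ be a $\mathbb{Z}$-basis for $\mathfrak{a}$. Then, in the notation of
Theorem 8.1, a $\mathbb{Z}$-basis for $\sigma(\mathfrak{a})$ in $L^{st}$ is

\[
\begin{array}{c}
(x_1^{(1)}, \ldots, x_s^{(1)}, y_1^{(1)}, z_1^{(1)}, \ldots, y_t^{(1)}, z_t^{(1)}), \\
\vdots \\
(x_1^{(n)}, \ldots, x_s^{(n)}, y_1^{(n)}, z_1^{(n)}, \ldots, y_t^{(n)}, z_t^{(n)}).
\end{array}
\]

Hence by Lemma 9.3, if $T$ is a fundamental domain for $\sigma(\mathfrak{a})$, we have

\[ v(T) = |D|, \]

where $D$ is as in Theorem 8.1. Using the notation of that theorem we have

\[ D = (-2i)^{-t} E \]

so that

\[ |D| = 2^{-t} |E|. \]

Now $E^2 = \Delta[\alpha_1, \ldots, \alpha_n]$ and

\[ N(\mathfrak{a}) = \left| \frac{\Delta[\alpha_1, \ldots, \alpha_n]}{\Delta} \right|^{1/2} \]

by Theorem 5.9, whence the result. □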


We may now apply Theorem 9.4 and Lemma 9.2 to yield the important: 


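Theorem 9.5. If $\mathfrak{a} \neq 0$ is an ideal of $\mathfrak{O}$ then $\mathfrak{a}$ contains a non-zero integer $\alpha$ with

\[ |N(\alpha)| \leq \left(\frac{2}{\pi}\right)^t N(\mathfrak{a}) \sqrt{|\Delta|} \]

where $\Delta$ is the discriminant of $K$.


Proof: For a fixed but arbitrary $\epsilon > 0$ choose positive real numbers
$c_1, \ldots, c_{s+t}$ with

\[ c_1 \cdots c_{s+t} = \left(\frac{2}{\pi}\right)^t N(\mathfrak{a}) \sqrt{|\Delta|} + \epsilon. \]

By Lemma 9.2 and Theorem 9.4 it follows that there exists $0 \neq \alpha \in \mathfrak{a}$ such
that

\[
\begin{aligned}
&|\sigma_1(\alpha)| < c_1, \ldots, |\sigma_s(\alpha)| < c_s, \\
&|\sigma_{s+1}(\alpha)|^2 < c_{s+1}, \ldots, |\sigma_{s+t}(\alpha)|^2 < c_{s+t}.
\end{aligned}
\]

Multiplying all these inequalities together we obtain

\[ |N(\alpha)| < c_1 \cdots c_s c_{s+1} \cdots c_{s+t} = \left(\frac{2}{\pi}\right)^t N(\mathfrak{a}) \sqrt{|\Delta|} + \epsilon. \]

Since a lattice is discrete, it follows that the set $A_\epsilon$ of such $\alpha$ is finite. Also
$A_\epsilon \neq \emptyset$, so that $A = \bigcap_\epsilon A_\epsilon \neq \emptyset$. If we pick $\alpha \in A$ then

\[ |N(\alpha)| \leq \left(\frac{2}{\pi}\right)^t N(\mathfrak{a}) \sqrt{|\Delta|}. \]

□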
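Corollary 9.6. Every non-zero ideal $\mathfrak{a}$ of $\mathfrak{O}$ is equivalent to an ideal whose
norm is $\leq (2/\pi)^t \sqrt{|\Delta|}$.


Proof: The class of fractional ideals equivalent to $\mathfrak{a}^{-1}$ contains an ideal $\mathfrak{c}$,
so $\mathfrak{a}\mathfrak{c} \sim \mathfrak{O}$. We can use Theorem 9.5 to find an integer $\gamma \in \mathfrak{c}$ such that

\[ |N(\gamma)| \leq \left(\frac{2}{\pi}\right)^t N(\mathfrak{c}) \sqrt{|\Delta|}. \]

Since $\mathfrak{c} \mid \gamma$ we have

\[ (\gamma) = \mathfrak{c}\mathfrak{b} \]

for some ideal $\mathfrak{b}$. Since $N(\mathfrak{b})N(\mathfrak{c}) = N(\mathfrak{b}\mathfrak{c}) = N((\gamma)) = |N(\gamma)|$ we have

\[ N(\mathfrak{b}) \leq \left(\frac{2}{\pi}\right)^t \sqrt{|\Delta|}. \]

We claim that $\mathfrak{b} \sim \mathfrak{a}$. This is clear since $\mathfrak{c} \sim \mathfrak{a}^{-1}$ and $\mathfrak{b} \sim \mathfrak{c}^{-1}$. □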


An explicit computation. If $K = \mathbb{Q}(\sqrt{-5})$, then $\mathfrak{O} = \mathbb{Z}[\sqrt{-5}]$ does
not have unique factorization, so $h > 1$. Because the monomorphisms
$\sigma_i : K \to \mathbb{C}$ are $\sigma_1, \sigma_2$ where $\sigma_1 \neq \sigma_2$ and $\bar{\sigma}_1 = \sigma_2$, we have $t = 1$. The
discriminant $\Delta$ of $K$ is $\Delta = -20$. Hence

\[ \left(\frac{2}{\pi}\right)^t \sqrt{|\Delta|} = \frac{2}{\pi}\sqrt{20} < 2.85. \]

Every ideal of $\mathfrak{O}$ is then equivalent to an ideal of norm less than 2.85, which
means a norm of 1 or 2. An ideal of norm 1 is the whole ring $\mathfrak{O}$, hence
principal. An ideal $\mathfrak{a}$ of norm 2 satisfies $\mathfrak{a} \mid 2$ by Theorem 5.14 (b), so $\mathfrak{a}$ is a
factor of $(2)$. But

\[ (2) = (2, 1 + \sqrt{-5})^2 \]

where $(2, 1 + \sqrt{-5})$ is prime and has norm 2. So $(2, 1 + \sqrt{-5})$ is the only
ideal of norm 2. Hence every ideal of $\mathfrak{O}$ is equivalent to $\mathfrak{O}$ or $(2, 1 + \sqrt{-5})$
which are themselves inequivalent (since $(2, 1 + \sqrt{-5})$ is not principal),
proving $h = 2$.
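
The numerical part of this computation is easily reproduced. The sketch below evaluates the bound of Corollary 9.6 and lists the norms that need to be examined; the function name `minkowski_bound` is introduced here for illustration, and the values of $t$ and $\Delta$ must be supplied by the reader.

```python
from math import pi, sqrt, floor

def minkowski_bound(t, disc):
    """The bound (2/pi)^t * sqrt(|disc|) of Corollary 9.6."""
    return (2 / pi) ** t * sqrt(abs(disc))

# K = Q(sqrt(-5)): t = 1 and discriminant -20, as in the computation above.
bound = minkowski_bound(t=1, disc=-20)
candidate_norms = list(range(1, floor(bound) + 1))   # norms still to be examined
```

Here `bound` is approximately 2.847, so only ideals of norm 1 or 2 need to be considered, in agreement with the argument above.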


9.3 Finiteness of the Class-Group


Theorem 9.7. The class-group of a number field is a finite abelian group.
The class-number $h$ is finite.

Proof: Let $K$ be a number field, of discriminant $\Delta$, and degree $n = s + 2t$ as
usual. We know that the class-group $\mathcal{H} = \mathcal{F}/\mathcal{P}$ is abelian, so it remains to
prove $\mathcal{H}$ finite: this is true if and only if the number of distinct equivalence
classes of fractional ideals is finite. Let $[\mathfrak{c}]$ be such an equivalence class.
Then $[\mathfrak{c}]$ contains an ideal $\mathfrak{a}$, and by Corollary 9.6, $\mathfrak{a}$ is equivalent to an
ideal $\mathfrak{b}$ with $N(\mathfrak{b}) \leq (2/\pi)^t \sqrt{|\Delta|}$. Since only finitely many ideals have a
given norm (Theorem 5.17 (c)) there are only finitely many choices for $\mathfrak{b}$.
Since $[\mathfrak{c}] = [\mathfrak{a}] = [\mathfrak{b}]$ (because $\mathfrak{c} \sim \mathfrak{a} \sim \mathfrak{b}$) it follows that there are only
finitely many equivalence classes $[\mathfrak{c}]$, whence $\mathcal{H}$ is a finite group and $h = |\mathcal{H}|$
is finite. □


From simple group-theoretic facts we obtain the useful: 


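Proposition 9.8. Let $K$ be a number field of class-number $h$, and $\mathfrak{a}$ an ideal
of the ring of integers $\mathfrak{O}$. Then

(a) $\mathfrak{a}^h$ is principal,

(b) If $q$ is prime to $h$ and $\mathfrak{a}^q$ is principal, then $\mathfrak{a}$ is principal.


Proof: Since $h = |\mathcal{H}|$ we have $[\mathfrak{a}]^h = [\mathfrak{O}]$ for all $[\mathfrak{a}] \in \mathcal{H}$, because $[\mathfrak{O}]$
is the identity element of $\mathcal{H}$. Hence $[\mathfrak{a}^h] = [\mathfrak{a}]^h = [\mathfrak{O}]$, so $\mathfrak{a}^h \sim \mathfrak{O}$, so
$\mathfrak{a}^h$ is principal. This proves (a). For (b) choose $u$ and $v \in \mathbb{Z}$ such that
$uh + vq = 1$. Then $[\mathfrak{a}]^q = [\mathfrak{O}]$, so we have

\[
\begin{aligned}
[\mathfrak{a}] &= [\mathfrak{a}]^{uh + vq} \\
&= ([\mathfrak{a}]^h)^u ([\mathfrak{a}]^q)^v \\
&= [\mathfrak{O}]^u [\mathfrak{O}]^v \\
&= [\mathfrak{O}],
\end{aligned}
\]

hence again $\mathfrak{a}$ is principal. □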
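
The step $uh + vq = 1$ in part (b) is effective: $u$ and $v$ come from the Euclidean algorithm. The following sketch is only a toy model, assuming for illustration a cyclic class-group of order $h = 6$ written additively as the residues modulo $h$ (class-groups need not be cyclic), so that the condition that the $q$th power of a class is the identity reads $qx \equiv 0 \pmod{h}$. The values $h = 6$, $q = 5$ and the function name are choices made here.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with u*a + v*b == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, u, v = extended_gcd(b, a % b)
    return g, v, u - (a // b) * v

h, q = 6, 5                      # toy class-number and an exponent prime to it
g, u, v = extended_gcd(h, q)     # u*h + v*q == 1

# Toy model: a class is a residue x mod h; its q-th "power" is q*x mod h.
# Then x = (u*h + v*q)*x = u*(h*x) + v*(q*x), which vanishes mod h when q*x does.
solutions = [x for x in range(h) if (q * x) % h == 0]
```

In this model the only class killed by $q$ is the identity, matching the conclusion of part (b).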


9.4 How to Make an Ideal Principal 


This section and the next are not required elsewhere in the book and may 
be omitted if so desired. 

Given an ideal $\mathfrak{a}$ in the ring $\mathfrak{O}$ of integers of a number field $K$, we
already know that $\mathfrak{a}$ has at most two generators

\[ \mathfrak{a} = (\alpha, \beta) \quad (\alpha, \beta \in \mathfrak{O}). \]

What we demonstrate in this section is that we can find an extension
number field $E \supseteq K$ with integers $\mathfrak{O}'$ such that the extended ideal $\mathfrak{O}'\mathfrak{a}$ in
$\mathfrak{O}'$ is principal. As a standard piece of notation we shall retain the symbols
$(\alpha)$, $(\alpha, \beta)$ to denote the ideals in $\mathfrak{O}$ generated by $\alpha$ and by $\alpha, \beta \in \mathfrak{O}$, whilst
writing the ideal in $\mathfrak{O}'$ generated by $S \subseteq \mathfrak{O}'$ as $\mathfrak{O}'S$. For example $\mathfrak{O}'\kappa$ will
denote the principal ideal in $\mathfrak{O}'$ generated by $\kappa \in \mathfrak{O}'$.


Lemma 9.9. If $S_1, S_2$ are subsets of $\mathfrak{O}'$, then

\[ \mathfrak{O}'(S_1 S_2) = (\mathfrak{O}'S_1)(\mathfrak{O}'S_2). \]


Proof: Trivial (remembering $1 \in \mathfrak{O}'$). □

The central result is then:


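Theorem 9.10. Let $K$ be a number field, $\mathfrak{a}$ an ideal in the ring of integers
$\mathfrak{O}$ of $K$. Then there exists an algebraic integer $\kappa$ such that if $\mathfrak{O}'$ is the ring
of integers of $K(\kappa)$, then

(i) $\mathfrak{O}'\kappa = \mathfrak{O}'\mathfrak{a}$

(ii) $(\mathfrak{O}'\kappa) \cap \mathfrak{O} = \mathfrak{a}$

(iii) If $B$ is the ring of all algebraic integers, then $(B\kappa) \cap K = \mathfrak{a}$.

(iv) If $\mathfrak{O}''\gamma = \mathfrak{O}''\mathfrak{a}$ for any $\gamma \in B$, and any ring $\mathfrak{O}''$ of integers, then
$\gamma = u\kappa$ where $u$ is a unit of $B$.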

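Proof: By Proposition 9.8, $\mathfrak{a}^h$ is principal, say $\mathfrak{a}^h = (\omega)$. Let $\kappa = \omega^{1/h} \in B$,
and consider $E = K(\kappa)$. Let $\mathfrak{O}' = B \cap E$ be the ring of integers in $E$,
then clearly $\kappa \in \mathfrak{O}'$. Since $\mathfrak{a}^h = (\omega)$, it follows, using Lemma 9.9, that

\[ (\mathfrak{O}'\mathfrak{a})^h = \mathfrak{O}'(\mathfrak{a}^h) = \mathfrak{O}'\omega = \mathfrak{O}'\kappa^h = (\mathfrak{O}'\kappa)^h. \]

Uniqueness of factorization of ideals in $\mathfrak{O}'$ easily yields

\[ \mathfrak{O}'\mathfrak{a} = \mathfrak{O}'\kappa, \]

proving (i).
Since (iii) implies (ii), we now consider (iii). The inclusion $\mathfrak{a} \subseteq B\kappa \cap K$
is straightforward. Conversely, suppose $\gamma \in B\kappa \cap K$, then

\[ \gamma = \lambda\kappa \quad (\lambda \in B) \]

and we are required to show $\gamma \in \mathfrak{a}$. First note that, since $\gamma \in K$, $\kappa \in E$,
we have $\lambda = \gamma\kappa^{-1} \in E$, so $\lambda \in E \cap B = \mathfrak{O}'$.
This gives

\[ \gamma^h = \lambda^h \kappa^h = \lambda^h \omega \quad (\gamma \in K,\ \lambda \in \mathfrak{O}',\ \omega \in \mathfrak{O}) \]

so $\gamma^h \in B$, and by Theorem 2.10, $\gamma \in B$. Thus $\gamma \in B \cap K = \mathfrak{O}$. Considering
the equation $\gamma^h = \lambda^h \omega$ again, we find

\[ \lambda^h = \gamma^h \omega^{-1} \in K \]

so $\lambda^h \in K \cap B = \mathfrak{O}$. Thus we finish up with

\[ \gamma^h = \lambda^h \omega \quad (\gamma, \lambda^h, \omega \in \mathfrak{O}). \]

Taking ideals in $\mathfrak{O}$ we get

\[ (\gamma)^h = (\lambda^h)(\omega) = (\lambda^h)\mathfrak{a}^h. \]

Unique factorization in $\mathfrak{O}$ implies $(\lambda^h) = \mathfrak{b}^h$ for some ideal $\mathfrak{b}$, so

\[ (\gamma)^h = \mathfrak{b}^h \mathfrak{a}^h \]

and unique factorization once more implies

\[ (\gamma) = \mathfrak{b}\mathfrak{a}, \]

whence $\gamma \in \mathfrak{a}$, as required.
The proof of (iv) is found by noting that by Theorem 5.20, $\mathfrak{a} = (\alpha, \beta)$
for $\alpha, \beta \in \mathfrak{O}$; and substituting in (iv) gives

\[ \mathfrak{O}''\gamma = \mathfrak{O}''(\alpha, \beta). \]

Thus

\[ \gamma = \lambda\alpha + \mu\beta \]

where $\lambda, \mu \in \mathfrak{O}''$, so certainly $\lambda, \mu \in B$. From (i), $\alpha, \beta \in \mathfrak{O}'\kappa$, so

\[ \alpha = \eta\kappa, \quad \beta = \zeta\kappa \quad (\eta, \zeta \in \mathfrak{O}' \subseteq B). \]

Hence $\gamma = \lambda\eta\kappa + \mu\zeta\kappa$ and $\kappa \mid \gamma$ in $B$. Interchanging the roles of $\gamma$, $\kappa$ proves
(iv). □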


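Theorem 9.10 can be improved, for as it stands the extension ring $\mathfrak{O}'$ in
which $\mathfrak{O}'\mathfrak{a}$ is principal depends on $\mathfrak{a}$. We can actually find a single extension
ring in which the extension of every ideal is principal. This depends on the
following lemma and the finiteness of the class-number:


Lemma 9.11. If $\mathfrak{a}, \mathfrak{b}$ are equivalent ideals in the ring $\mathfrak{O}$ of integers of a
number field and $\mathfrak{O}'\mathfrak{a}$ is principal, then so is $\mathfrak{O}'\mathfrak{b}$.

Proof: By the definition of equivalence, there exist principal ideals $\mathfrak{d}, \mathfrak{e}$ of
$\mathfrak{O}$ such that $\mathfrak{a}\mathfrak{d} = \mathfrak{b}\mathfrak{e}$. Hence

\[ (\mathfrak{O}'\mathfrak{a})(\mathfrak{O}'\mathfrak{d}) = (\mathfrak{O}'\mathfrak{b})(\mathfrak{O}'\mathfrak{e}) \]

where now $\mathfrak{O}'\mathfrak{a}$, $\mathfrak{O}'\mathfrak{d}$, $\mathfrak{O}'\mathfrak{e}$ are all principal. Since the set $\mathcal{P}$ of principal
fractional ideals of $\mathfrak{O}'$ is a group, $\mathfrak{O}'\mathfrak{b}$ is a principal fractional ideal which
is also an ideal, so $\mathfrak{O}'\mathfrak{b}$ is a principal ideal. □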


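Theorem 9.12. Let $K$ be a number field with integers $\mathfrak{O}_K$, then there exists
a number field $L \supseteq K$ with ring $\mathfrak{O}_L$ of integers such that for every ideal $\mathfrak{a}$
in $\mathfrak{O}_K$ we have

(i) $\mathfrak{O}_L\mathfrak{a}$ is a principal ideal,

(ii) $(\mathfrak{O}_L\mathfrak{a}) \cap \mathfrak{O}_K = \mathfrak{a}$.


Proof: Since $h$ is finite, select a representative set of ideals $\mathfrak{a}_1, \ldots, \mathfrak{a}_h$, one
from each class and choose algebraic integers $\kappa_1, \ldots, \kappa_h$ such that $\mathfrak{O}_i\mathfrak{a}_i$ is
principal where $\mathfrak{O}_i$ is the ring of integers of $K(\kappa_i)$. Let $L = K(\kappa_1, \ldots, \kappa_h)$,
then its ring $\mathfrak{O}_L$ of integers contains all the $\mathfrak{O}_i$. Hence each ideal $\mathfrak{O}_L\mathfrak{a}_i$ is
principal in $\mathfrak{O}_L$. Since every ideal $\mathfrak{a}$ in $\mathfrak{O}_K$ is equivalent to some $\mathfrak{a}_i$, then
$\mathfrak{O}_L\mathfrak{a}$ is principal by Lemma 9.11, say

\[ \mathfrak{O}_L\mathfrak{a} = \mathfrak{O}_L\alpha \quad (\alpha \in B). \]

This proves (i).
Clearly $\mathfrak{a} \subseteq (\mathfrak{O}_L\mathfrak{a}) \cap \mathfrak{O}_K$. For the converse inclusion, Theorem 9.10 (iv)
implies $\alpha = u\kappa$ where $u$ is a unit in $B$. Hence

\[
\begin{aligned}
(\mathfrak{O}_L\mathfrak{a}) \cap \mathfrak{O}_K &= (\mathfrak{O}_L\alpha) \cap \mathfrak{O}_K \\
&\subseteq (B\alpha) \cap K \\
&= (B\kappa) \cap K \\
&= \mathfrak{a}
\end{aligned}
\]

by Theorem 9.10 (iii). □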


Until quite recently it was an open question, going back to
Hilbert, whether every number field can be embedded in one with
unique factorization. However, in 1964 Golod and Šafarevič [32] showed
that this is not always possible, citing the explicit example of
$\mathbb{Q}(\sqrt{-3 \cdot 5 \cdot 7 \cdot 11 \cdot 13 \cdot 17 \cdot 19})$. The proof is ingenious rather than hard,
but it uses ideas we have not developed.
9.5 Unique Factorization of Elements
in an Extension Ring


The results of the last section can be translated from principal ideals back
to elements to give the version of Kummer’s theory alluded to in the
introduction to Chapter 5. There we considered examples of non-unique
factorization such as

$$10 = 2 \cdot 5 = (5 + \sqrt{15})(5 - \sqrt{15})$$

in the ring of integers of $\mathbb{Q}(\sqrt{15})$. Viewing this as an equation in $\mathbb{Q}(\sqrt{3}, \sqrt{5})$,
we saw that the factors could be further reduced as

$$2 = (\sqrt{5} + \sqrt{3})(\sqrt{5} - \sqrt{3})$$
$$5 = \sqrt{5}\,\sqrt{5}$$
$$5 + \sqrt{15} = \sqrt{5}(\sqrt{5} + \sqrt{3})$$
$$5 - \sqrt{15} = \sqrt{5}(\sqrt{5} - \sqrt{3})$$

and the two factorizations of 10 found above were just regroupings of the
factors

$$10 = \sqrt{5}\,\sqrt{5}\,(\sqrt{5} + \sqrt{3})(\sqrt{5} - \sqrt{3}).$$

We can now show that such a phenomenon occurs in all cases in rings of
integers.
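
The regrouping of the factors of 10 can be checked mechanically. The following is a minimal sketch, not part of the original text: it assumes elements of Q(√3, √5) are stored as integer 4-tuples over the basis 1, √3, √5, √15, and the helper names `mul` and `prod` are chosen here purely for illustration.

```python
# Elements of Q(sqrt3, sqrt5) stored as integer 4-tuples (a, b, c, d),
# meaning a + b*sqrt3 + c*sqrt5 + d*sqrt15 (an illustrative encoding).
def mul(x, y):
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (
        a1 * a2 + 3 * b1 * b2 + 5 * c1 * c2 + 15 * d1 * d2,  # rational part
        a1 * b2 + b1 * a2 + 5 * (c1 * d2 + d1 * c2),          # sqrt3 part
        a1 * c2 + c1 * a2 + 3 * (b1 * d2 + d1 * b2),          # sqrt5 part
        a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,                # sqrt15 part
    )

def prod(*factors):
    result = (1, 0, 0, 0)
    for f in factors:
        result = mul(result, f)
    return result

SQRT5 = (0, 0, 1, 0)
PLUS = (0, 1, 1, 0)    # sqrt5 + sqrt3
MINUS = (0, -1, 1, 0)  # sqrt5 - sqrt3

two = prod(PLUS, MINUS)     # grouping that gives 2
five = prod(SQRT5, SQRT5)   # grouping that gives 5
left = prod(SQRT5, PLUS)    # grouping that gives 5 + sqrt15
right = prod(SQRT5, MINUS)  # grouping that gives 5 - sqrt15

print(two, five, left, right)
print(prod(two, five), prod(left, right))
```

Both groupings multiply out to the tuple representing 10, which is the sense in which the two factorizations are regroupings of one finer factorization.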


Theorem 9.13. Suppose $K$ is a number field with integers $\mathfrak{O}_K$. Then there
exists an extension field $L \supseteq K$ with integers $\mathfrak{O}_L$ such that every non-zero,
non-unit $a \in \mathfrak{O}_K$ has a factorization

$$a = p_1 \cdots p_r \quad (p_i \in \mathfrak{O}_L)$$

where the $p_i$ are non-units in $\mathfrak{O}_L$, and the following property is satisfied:
Given any factorization in $\mathfrak{O}_K$:

$$a = a_1 \cdots a_s$$

where the $a_i$ are non-units in $\mathfrak{O}_K$, there exist integers

$$1 \le n_1 < \cdots < n_s = r$$

and a permutation $\pi$ of $\{1, \ldots, r\}$ such that the following elements are
associates in $\mathfrak{O}_L$:

$$a_1, \quad p_{\pi(1)} \cdots p_{\pi(n_1)}$$
$$\vdots$$
$$a_s, \quad p_{\pi(n_{s-1}+1)} \cdots p_{\pi(n_s)}.$$
Remark. What this theorem says in plain language is that the
factorizations of elements into irreducibles in $\mathfrak{O}_K$ may not be unique, but all
factorizations of an element in $\mathfrak{O}_K$ come from different groupings of
associates of a single factorization in $\mathfrak{O}_L$. In this sense elements in $\mathfrak{O}_K$ have
unique factorization into elements in $\mathfrak{O}_L$.


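Proof: There is a unique factorization of $(a)$ into prime ideals in $\mathfrak{O}_K$, say

$$(a) = \mathfrak{p}_1 \cdots \mathfrak{p}_r.$$

Since $a$ is a non-unit, we have $r \ge 1$. Let $\mathfrak{O}_L$ be a ring of integers as in
Theorem 9.12 where every ideal of $\mathfrak{O}_K$ extends to a principal ideal, and
suppose

$$\mathfrak{O}_L\mathfrak{p}_i = \mathfrak{O}_L p_i \quad (p_i \in \mathfrak{O}_L).$$

Then $a = u p_1 \cdots p_r$ where $u$ is a unit in $\mathfrak{O}_L$, and since $r \ge 1$, we may
replace $p_1$ by $u p_1 \in \mathfrak{O}_L\mathfrak{p}_1$ to get a factorization of the form

$$a = p_1 \cdots p_r.$$

Given any factorization of elements

$$a = a_1 \cdots a_s$$

where the $a_i$ are non-units in $\mathfrak{O}_K$, then

$$(a) = (a_1) \cdots (a_s),$$

where all the $(a_i)$ are proper ideals. Unique factorization in $\mathfrak{O}_K$ gives us
the integers $n_i$ and the permutation $\pi$ such that

$$(a_1) = \mathfrak{p}_{\pi(1)} \cdots \mathfrak{p}_{\pi(n_1)}$$
$$\vdots$$
$$(a_s) = \mathfrak{p}_{\pi(n_{s-1}+1)} \cdots \mathfrak{p}_{\pi(n_s)}.$$

Now take ideals in $\mathfrak{O}_L$ generated by these ideals and the result follows. □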


Example 9.14. From the explicit computation of Section 2, if
$K = \mathbb{Q}(\sqrt{-5})$, then $h = 2$ and a representative set of ideals is $\mathfrak{O}$, and
$(2, 1 + \sqrt{-5})$ where $(2, 1 + \sqrt{-5})^2 = (2)$. Hence we may take $L = K(\sqrt{2}) =
\mathbb{Q}(\sqrt{-5}, \sqrt{2})$. Theorem 9.13 tells us that every element of $\mathbb{Z}[\sqrt{-5}]$
factorizes uniquely in the integers of $\mathbb{Q}(\sqrt{-5}, \sqrt{2})$. The case of the factorization
of the element 6 will be dealt with in Exercise 7 at the end of this chapter
where we shall find

$$6 = \sqrt{2}\,\sqrt{2}\,\left(\tfrac12\sqrt{2} + \tfrac12\sqrt{-10}\right)\left(\tfrac12\sqrt{2} - \tfrac12\sqrt{-10}\right).$$

The fact that $\tfrac12\sqrt{2} \pm \tfrac12\sqrt{-10}$ really are integers may be dealt with by
computing the explicit minimum polynomials of these elements over $\mathbb{Q}$.
Granted this, it is an easy matter to check that the two alternative
factorizations in $\mathbb{Z}[\sqrt{-5}]$:

$$6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$$

are just different groupings of the factors in the integers of $\mathbb{Q}(\sqrt{-5}, \sqrt{2})$.
(Do it!) The above example shows up a basic problem in factorizing
elements in an extension ring. We have not given a general method of
computing the integers in a number field. To date we have only explicitly
calculated the integers in quadratic and cyclotomic fields and those calculations
were not trivial. There is also another weakness of factorizing elements in
an extension ring. The elements $p_i$ occurring in the factorization of $a$ in
Theorem 9.13 need not be irreducible. (For instance we might work in a
slightly larger ring $\mathfrak{O}_{L'}$ containing $\sqrt{p_i}$; the method of adjoining $\kappa = \omega^{1/h}$
may very well add such roots.) However, the proof of Theorem 9.13 tells
us that the factorization of the element $a$ in $\mathfrak{O}_L$ which gives the unique
factorization properties is given by the factorization of the ideal $(a)$ in the
ring $\mathfrak{O}_K$. For this reason we may just as well stick to ideals in the original
ring rather than embellish the situation by factorizing elements outside.
Our computations in future will be concerned mainly with ideals, since unique
factorization of ideals proves so much easier to handle!
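
As a hedged illustration of the integrality check mentioned in Example 9.14, the following worked computation finds a monic integer polynomial satisfied by one of the two elements. The symbol $\beta$ is introduced here only for this sketch, and $\sqrt{-10}$ is taken to mean $\sqrt{2}\,\sqrt{-5}$.

```latex
\begin{align*}
\beta &= \tfrac12\sqrt{2} + \tfrac12\sqrt{-10}, \\
\beta^2 &= \tfrac14\left(2 + 2\sqrt{-20} - 10\right) = -2 + \sqrt{-5}, \\
(\beta^2 + 2)^2 &= -5 \quad\Longrightarrow\quad \beta^4 + 4\beta^2 + 9 = 0 .
\end{align*}
```

The same polynomial $t^4 + 4t^2 + 9$ is satisfied by $\tfrac12\sqrt{2} - \tfrac12\sqrt{-10}$, whose square is $-2 - \sqrt{-5}$. Since the polynomial is monic with integer coefficients, both elements are algebraic integers; one also sees directly that $\sqrt{2}\,\beta = 1 + \sqrt{-5}$.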


9.6 Exercises 


1. Let $K = \mathbb{Q}(\sqrt{-5})$, and let $\mathfrak{p}, \mathfrak{q}, \mathfrak{r}$ be the ideals defined in Exercise 2
of Chapter 5 (page 124). Let $H$ be the class group. Show that in $H$
we have
$$[\mathfrak{p}]^2 = [\mathfrak{O}], \quad [\mathfrak{p}][\mathfrak{q}] = [\mathfrak{O}], \quad [\mathfrak{p}][\mathfrak{r}] = [\mathfrak{O}],$$
and hence show that $\mathfrak{p}, \mathfrak{q}, \mathfrak{r}$ are equivalent.

2. Verify by explicit computation that $\mathfrak{p}, \mathfrak{q}, \mathfrak{r}$ are equivalent.

3. Using Corollary 9.6, show that for $K = \mathbb{Q}(\sqrt{-6})$ every ideal is
equivalent to one of norm at most 3. Verify
$$(2) = (2, \sqrt{-6})^2,$$
$$(3) = (3, \sqrt{-6})^2,$$
and conclude that the only ideals of norm 2, 3 are $(2, \sqrt{-6})$, $(3, \sqrt{-6})$.
Deduce $h \le 3$ and using $(2, \sqrt{-6})^2 = (2)$, or otherwise, show $h = 2$.

4. Find principal ideals $\mathfrak{a}, \mathfrak{b}$ in $\mathbb{Z}[\sqrt{-6}]$ such that
$$\mathfrak{a}\,(2, \sqrt{-6}) = \mathfrak{b}\,(3, \sqrt{-6}).$$

5. Find all squarefree integers $d$ in $-10 < d < 10$ such that the
class-number of $\mathbb{Q}(\sqrt{d})$ is 1. (Hint: look up a few theorems!)

6. Using methods similar to Exercise 3, calculate the class-number of
$\mathbb{Q}(\sqrt{d})$ for $d$ squarefree and $-10 < d < 10$.

7. Suppose $K = \mathbb{Q}(\sqrt{-5})$, $\mathfrak{p} = (2, 1 + \sqrt{-5})$. Let $\mathfrak{O}'$ be the ring of
integers of $\mathbb{Q}(\sqrt{-5}, \sqrt{2})$. Show $\mathfrak{O}'\mathfrak{p} = \mathfrak{O}'\sqrt{2}$. Find explicit integers
$a, b \in \mathfrak{O}'$ such that
$$2 = \sqrt{2}\,a, \quad 1 + \sqrt{-5} = \sqrt{2}\,b,$$
and verify that $a, b$ are integers by computing the monic polynomials
which they satisfy over $\mathbb{Q}$. Using the notation of Exercise 1, find
$\kappa_1, \kappa_2 \in \mathfrak{O}'$ such that
$$\mathfrak{O}'\kappa_1 = \mathfrak{O}'\mathfrak{q}, \quad \mathfrak{O}'\kappa_2 = \mathfrak{O}'\mathfrak{r}$$
and use the factorization $(6) = \mathfrak{p}^2\mathfrak{q}\mathfrak{r}$ to factorize the element 6 in $\mathfrak{O}'$.
Explain how this factorization relates to
$$6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$$
in $\mathbb{Z}[\sqrt{-5}]$.

8. In $\mathbb{Z}[\sqrt{-10}]$ we have the factorizations into irreducibles
$$14 = 2 \cdot 7 = (2 + \sqrt{-10})(2 - \sqrt{-10}).$$
Find an extension ring $\mathfrak{O}'$ of $\mathbb{Z}[\sqrt{-10}]$ and a factorization of 14 in $\mathfrak{O}'$
such that the given factorizations are found by different groupings of
the factors.

9. Factorize $6 = 2 \cdot 3 = (4 + \sqrt{10})(4 - \sqrt{10}) \in \mathbb{Z}[\sqrt{10}]$ in an extension ring
to exhibit the given factors as different groupings of the new ones.

10. Relate the factorization
$$10 = \sqrt{5}\,\sqrt{5}\,(\sqrt{5} + \sqrt{3})(\sqrt{5} - \sqrt{3})$$
in the integers of $\mathbb{Q}(\sqrt{3}, \sqrt{5})$ to the factorization of $(10)$ into prime
ideals in the integers of $\mathbb{Q}(\sqrt{15})$. Explain how this gives rise to the
different factorizations
$$10 = 2 \cdot 5 = (5 + \sqrt{15})(5 - \sqrt{15})$$
into irreducibles in the integers of $\mathbb{Q}(\sqrt{15})$.
Computational Methods


The results of this chapter, although apparently diverse, all have a strong 
bearing on the question of practical computation of the class-number, 
within the limits of the techniques now at our command. In the first sec- 
tion we study a special case of how a rational prime breaks up into prime 
ideals in a number field. The second section supplements this by showing 
that the distinct classes of fractional ideals may be found from the prime 
ideals dividing a finite set of rational primes, this set being in some sense 
‘small’ provided the degree of K and its discriminant are not too ‘large’. 
Several specific cases are studied, especially quadratic fields: in particular
we complete the list of fields $\mathbb{Q}(\sqrt{d})$ with negative $d$ and with class-number
1 (although we do not prove our list complete).


10.1 Factorization of a Rational Prime 


If $p$ is a prime number in $\mathbb{Z}$, it is not generally true that $(p)$ is a prime ideal
in the ring of integers $\mathfrak{O}$ of a number field $K$. For instance, in $\mathbb{Q}(\sqrt{-1})$ we
have the factorization

$$(2) = (1 + \sqrt{-1})^2.$$


It is obviously useful to be able to compute the prime factors of (p). In the 
case where the ring of integers is generated by a single element (which in- 
cludes quadratic and cyclotomic fields), the following theorem of Dedekind 
is decisive. 
Theorem 10.1. Let $K$ be a number field of degree $n$ with ring of integers
$\mathfrak{O} = \mathbb{Z}[\theta]$ generated by $\theta \in \mathfrak{O}$. Given a rational prime $p$, suppose the
minimum polynomial $f$ of $\theta$ over $\mathbb{Q}$ gives rise to the factorization into
irreducibles over $\mathbb{Z}_p$:

$$\bar{f} = \bar{f}_1^{\,e_1} \cdots \bar{f}_r^{\,e_r}$$

where the bar denotes the natural map $\mathbb{Z}[t] \to \mathbb{Z}_p[t]$. Then if $f_i \in \mathbb{Z}[t]$ is
any polynomial mapping onto $\bar{f}_i$, the ideal

$$\mathfrak{p}_i = (p) + (f_i(\theta))$$

is prime and the prime factorization of $(p)$ in $\mathfrak{O}$ is

$$(p) = \mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_r^{e_r}.$$


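Proof: Let $\theta_i$ be a root of $\bar{f}_i$ in $\mathbb{Z}_p[\theta_i] \cong \mathbb{Z}_p[t]/(\bar{f}_i)$. There is a natural
map $\nu_i : \mathbb{Z}[\theta] \to \mathbb{Z}_p[\theta_i]$ given by

$$\nu_i(p(\theta)) = \bar{p}(\theta_i).$$

The image of $\nu_i$ is $\mathbb{Z}_p[\theta_i]$, which is a field, so $\ker\nu_i$ is a prime ideal of
$\mathbb{Z}[\theta] = \mathfrak{O}$. Clearly

$$(p) + (f_i(\theta)) \subseteq \ker\nu_i.$$

But if $g(\theta) \in \ker\nu_i$, then $\bar{g}(\theta_i) = 0$, so $\bar{g} = \bar{f}_i\bar{h}$ for some $\bar{h} \in \mathbb{Z}_p[t]$; this
means that $g - f_i h \in \mathbb{Z}[t]$ has coefficients divisible by $p$. Thus

$$g(\theta) = (g(\theta) - f_i(\theta)h(\theta)) + f_i(\theta)h(\theta) \in (p) + (f_i(\theta)),$$

showing

$$\ker\nu_i = (p) + (f_i(\theta)).$$

Let

$$\mathfrak{p}_i = (p) + (f_i(\theta)),$$

then for each $f_i$ the ideal $\mathfrak{p}_i$ is prime and satisfies $(p) \subseteq \mathfrak{p}_i$, i.e. $\mathfrak{p}_i \mid (p)$.
For any ideals $\mathfrak{a}, \mathfrak{b}_1, \mathfrak{b}_2$ we have

$$(\mathfrak{a} + \mathfrak{b}_1)(\mathfrak{a} + \mathfrak{b}_2) \subseteq \mathfrak{a} + \mathfrak{b}_1\mathfrak{b}_2,$$

so by induction

$$\mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_r^{e_r} \subseteq (p) + (f_1(\theta)^{e_1} \cdots f_r(\theta)^{e_r}) \subseteq (p) + (f(\theta)) = (p).$$

Thus $(p) \mid \mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_r^{e_r}$, and the only prime factors of $(p)$ are $\mathfrak{p}_1, \ldots, \mathfrak{p}_r$,
showing

$$(p) = \mathfrak{p}_1^{k_1} \cdots \mathfrak{p}_r^{k_r} \qquad (10.1)$$

where

$$0 < k_i \le e_i \quad (1 \le i \le r). \qquad (10.2)$$

The norm of $\mathfrak{p}_i$ is, by definition, $|\mathfrak{O}/\mathfrak{p}_i|$, and by using the isomorphisms

$$\mathfrak{O}/\mathfrak{p}_i = \mathbb{Z}[\theta]/\mathfrak{p}_i \cong \mathbb{Z}_p[\theta_i]$$

we find

$$N(\mathfrak{p}_i) = |\mathbb{Z}_p[\theta_i]| = p^{d_i}$$

where $d_i = \partial f_i$, or equivalently, $d_i = \partial\bar{f}_i$. Also

$$N((p)) = |\mathbb{Z}[\theta]/(p)| = p^n,$$

so, taking norms in Equation (10.1), we find

$$p^n = N((p)) = N(\mathfrak{p}_1)^{k_1} \cdots N(\mathfrak{p}_r)^{k_r} = p^{d_1k_1 + \cdots + d_rk_r}$$

which implies

$$d_1k_1 + \cdots + d_rk_r = n = d_1e_1 + \cdots + d_re_r. \qquad (10.3)$$

From Equation (10.2) we deduce $k_i = e_i$ $(1 \le i \le r)$ and this completes
the proof. □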


This result is not always applicable, since $\mathfrak{O}$ need not be of the form $\mathbb{Z}[\theta]$
in general. See Section 2.6, Example 2.23. But for quadratic or cyclotomic
fields we have already shown that $\mathfrak{O} = \mathbb{Z}[\theta]$, so the theorem applies in these
cases, and in many others. It also has the advantage of computability.
Since there is only a finite number of polynomials over $\mathbb{Z}_p$ of given degree,
the factorization of $\bar{f}$ can be performed in a finite number of steps. A little
native wit helps, but, if the worst comes to the worst, there are only a finite
number of polynomials of lower degree than $\bar{f}$ to try as factors.
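
The finite search described above can be sketched directly. This is an illustrative brute-force routine, not an efficient algorithm and not taken from the text: polynomials are assumed to be coefficient lists with the constant term first, and the function names `poly_divmod` and `factor_mod_p` are hypothetical.

```python
from itertools import product

def poly_divmod(f, g, p):
    """Divide f by a monic g over Z_p; returns (quotient, remainder)."""
    f = [c % p for c in f]
    q = [0] * max(len(f) - len(g) + 1, 0)
    for k in range(len(f) - len(g), -1, -1):
        c = f[k + len(g) - 1]      # current leading coefficient (g is monic)
        q[k] = c
        for j, gj in enumerate(g):
            f[k + j] = (f[k + j] - c * gj) % p
    return q, f[:len(g) - 1]

def factor_mod_p(f, p):
    """Factor a monic f over Z_p by trying monic divisors of increasing degree.

    Each factor is listed as often as it occurs, so the exponents e_i of
    Theorem 10.1 can be read off as multiplicities.
    """
    f = [c % p for c in f]
    factors = []
    d = 1
    while len(f) - 1 >= 2 * d:
        for tail in product(range(p), repeat=d):
            g = list(tail) + [1]   # monic candidate of degree d
            q, r = poly_divmod(f, g, p)
            if not any(r):
                factors.append(g)  # no smaller factor exists, so g is irreducible
                f = q
                break
        else:
            d += 1                 # no divisor of degree d: try the next degree
    if len(f) > 1:
        factors.append(f)          # what remains is irreducible
    return factors

# t^2 + 1 modulo 2, 3 and 5
for p in (2, 3, 5):
    print(p, factor_mod_p([1, 0, 1], p))
```

For $t^2 + 1$ the output lists $(t+1)$ twice modulo 2, the polynomial itself modulo 3, and two distinct linear factors modulo 5, corresponding to a ramified, an inert and a split prime respectively.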
For example, in $\mathbb{Q}(\sqrt{-1})$ we have $\mathfrak{O} = \mathbb{Z}[\theta]$ where $\theta$ has minimum
polynomial $t^2 + 1$.
To find the factorization of $(2)$ we look at this polynomial (mod 2), where
we have

$$t^2 + 1 = (t + 1)^2.$$

Hence $(2) = \mathfrak{p}^2$ where

$$\mathfrak{p} = (2) + (\sqrt{-1} + 1) = (1 + \sqrt{-1})$$

(because $2 = (1 + \sqrt{-1})(1 - \sqrt{-1})$), and we recover the example noted at
the beginning of this section.

More generally, consider the factorization in $\mathbb{Z}[\sqrt{-1}]$ of a prime $p \in \mathbb{Z}$.
There are three cases to consider:

(i) $t^2 + 1$ irreducible (mod $p$),

(ii) $t^2 + 1 = (t - \lambda)(t + \lambda)$ (mod $p$), (where $\lambda^2 \equiv -1$ (mod $p$)) and $\lambda \not\equiv -\lambda$
(i.e. $p \ne 2$),

(iii) $t^2 + 1 = (t + 1)^2$ (mod 2) when $p = 2$.

In case (i) $(p)$ is prime; in case (ii) $(p) = \mathfrak{p}_1\mathfrak{p}_2$ for distinct prime ideals
$\mathfrak{p}_1, \mathfrak{p}_2$; in case (iii) $(p) = \mathfrak{p}_1^2$ for a prime ideal $\mathfrak{p}_1$.

The distinction between cases (i) and (ii) is whether or not $-1$ is
congruent to a square (mod $p$). In the appendix on quadratic residues it may
be seen that case (i) applies if $p$ is of the form $4k - 1$ ($k \in \mathbb{Z}$), case (ii) if $p$
is of the form $4k + 1$ ($k \in \mathbb{Z}$).
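
The three cases can be checked numerically for small primes. The sketch below is only an illustration: it counts roots of $t^2 + 1$ modulo $p$ by exhaustive search, and the labels "inert", "split" and "ramified" are names chosen here for cases (i), (ii) and (iii).

```python
def splitting_type(p):
    """Classify a prime p in Z[sqrt(-1)] by counting roots of t^2 + 1 mod p."""
    roots = [x for x in range(p) if (x * x + 1) % p == 0]
    if not roots:
        return "inert"      # case (i): t^2 + 1 irreducible mod p
    if len(roots) == 1:
        return "ramified"   # case (iii): a repeated root, only for p = 2
    return "split"          # case (ii): two distinct roots

def is_prime(n):
    return n >= 2 and all(n % q for q in range(2, int(n ** 0.5) + 1))

for p in (q for q in range(2, 40) if is_prime(q)):
    print(p, p % 4, splitting_type(p))
```

In the printed table every prime with remainder 1 modulo 4 is split and every prime with remainder 3 is inert, in line with the quadratic residue criterion quoted above.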

The results in this section are, in fact, the tip of the iceberg of a large
and significant portion of algebraic number theory. Given a prime ideal
$\mathfrak{p}$ in the ring $\mathfrak{O}_K$ of integers in a number field $K$, we may consider the
extension ideal $\mathfrak{O}_L\mathfrak{p}$ in the ring of integers $\mathfrak{O}_L$ of an extension algebraic
number field. We find

$$\mathfrak{O}_L\mathfrak{p} = \mathfrak{q}_1^{e_1} \cdots \mathfrak{q}_r^{e_r}$$

where $\mathfrak{q}_1, \ldots, \mathfrak{q}_r$ are distinct prime ideals in $\mathfrak{O}_L$.


10.2 Minkowski’s Constants


The proof of Theorem 9.5 leaves room for improvement, in that it is based
on Lemma 9.2, which is far stronger than we really need. What we want
is a point $\alpha$ such that

$$|\sigma_1(\alpha)| \cdots |\sigma_s(\alpha)|\,|\sigma_{s+1}(\alpha)|^2 \cdots |\sigma_{s+t}(\alpha)|^2 < c_1 \cdots c_{s+t}; \qquad (10.4)$$

but what we actually find is a point $\alpha$ satisfying the considerably greater
restriction

$$|\sigma_1(\alpha)| < c_1, \ldots, |\sigma_s(\alpha)| < c_s, \quad |\sigma_{s+1}(\alpha)|^2 < c_{s+1}, \ldots, |\sigma_{s+t}(\alpha)|^2 < c_{s+t}. \qquad (10.5)$$

Certainly Equation (10.5) implies Equation (10.4), but not the reverse.

Our reason for using Equation (10.5) is that we wish to employ
Minkowski’s theorem. For Equation (10.5) the relevant set of points in $L^{st}$ is
convex and symmetric, so the theorem applies; but for Equation (10.4) the
relevant set, although symmetric, is not convex. This means we cannot use
Equation (10.4) directly. The gap between Equation (10.4) and (10.5) is
so great, however, that one might hope to find another set of inequalities,
corresponding to a convex subset of $L^{st}$, and implying Equation (10.4): this
would lead to improved estimates in Theorem 9.5 (and Corollary 9.6).

This can be done if we use the well-known inequality between the
arithmetic and geometric means

$$(a_1 \cdots a_n)^{1/n} \le \frac{1}{n}(a_1 + \cdots + a_n). \qquad (10.6)$$

The result we obtain is:


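Theorem 10.2. If $\mathfrak{a} \ne 0$ is an ideal of $\mathfrak{O}$ then $\mathfrak{a}$ contains an element $\alpha$
with

$$|N(\alpha)| \le \left(\frac{4}{\pi}\right)^t \frac{n!}{n^n}\, N(\mathfrak{a})\sqrt{|\Delta|}$$

where $n$ is the degree of $K$ and $\Delta$ is the discriminant.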


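Proof: Let $X_c$ be the set of all $x \in L^{st}$ such that

$$|x_1| + \cdots + |x_s| + 2\sqrt{y_1^2 + z_1^2} + \cdots + 2\sqrt{y_t^2 + z_t^2} < c,$$

where $c$ is a positive real number. Then $X_c$ is convex and centrally
symmetric, and it is a routine though non-trivial exercise to compute

$$v(X_c) = 2^s \left(\frac{\pi}{2}\right)^t \frac{1}{n!}\, c^n$$

(using induction and a change to polar coordinates). For details of this
computation see Lang [43] p. 116.
By Minkowski’s theorem, $X_c$ contains a point $\alpha \ne 0$ of $\sigma(\mathfrak{a})$ provided

$$v(X_c) > 2^{s+2t}\, v(T),$$

where $T$ is a fundamental domain for $\sigma(\mathfrak{a})$. By Theorem 9.4 we have

$$v(T) = 2^{-t} N(\mathfrak{a})\sqrt{|\Delta|}$$

so the condition on $X_c$ becomes

$$2^s \left(\frac{\pi}{2}\right)^t \frac{1}{n!}\, c^n > 2^{s+2t}\, 2^{-t} N(\mathfrak{a})\sqrt{|\Delta|},$$

which is

$$c^n > \left(\frac{4}{\pi}\right)^t n!\, N(\mathfrak{a})\sqrt{|\Delta|}.$$

For such an $\alpha$ we have

$$|N(\alpha)| = |\sigma_1(\alpha) \cdots \sigma_s(\alpha)\,\sigma_{s+1}(\alpha)^2 \cdots \sigma_{s+t}(\alpha)^2| < \left(\frac{c}{n}\right)^n$$

by the inequality between arithmetic and geometric mean.
Using $\varepsilon$’s as in Theorem 9.5 we may assume that $\alpha$ can be found for

$$c^n = \left(\frac{4}{\pi}\right)^t n!\, N(\mathfrak{a})\sqrt{|\Delta|}$$

and then

$$|N(\alpha)| \le \left(\frac{4}{\pi}\right)^t \frac{n!}{n^n}\, N(\mathfrak{a})\sqrt{|\Delta|}.$$
□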


The geometric considerations involved in the choice of $X_c$ in this proof
are illustrated in Figure 10.1 for the case where $n = 2$, $s = 2$, $t = 0$. The
three regions

A: $|xy| < 1$

B: $\dfrac{|x| + |y|}{2} < 1$

C: $|x| < 1$, $|y| < 1$

correspond respectively to the inequality (10.4), the region chosen in the
proof of Theorem 10.2, and the inequality (10.5). Note that A is not convex,
although B, C are; that $C \subseteq B \subseteq A$; and that B is much larger than C
(which is why it leads to a better estimate).

Figure 10.1. Geometry suggests the choice of $X_c$ in the proof of Theorem 10.2,
here illustrated for $n = s = 2$, $t = 0$. Region B is convex, lies within region A,
and is larger than the more obvious region C: in consequence the use of B, in
conjunction with Minkowski’s theorem, yields a better bound.


Corollary 10.3. Every class of fractional ideals contains an ideal $\mathfrak{a}$ with

$$N(\mathfrak{a}) \le \left(\frac{4}{\pi}\right)^t \frac{n!}{n^n} \sqrt{|\Delta|}.$$


Proof: As for Corollary 9.6. □


This result suggests the introduction of Minkowski constants

$$M_{st} = \left(\frac{4}{\pi}\right)^t \frac{(s + 2t)!}{(s + 2t)^{s+2t}}.$$

For future use, we give a short table of their values, taken from Lang [43].
(The numbers in the last column have all been rounded upwards in the
third decimal place, to avoid under-estimates.)
  n    s    t    M_st
  2    0    1    0.637
  2    2    0    0.500
  3    1    1    0.283
  3    3    0    0.223
  4    0    2    0.152
  4    2    1    0.120
  4    4    0    0.094
  5    1    2    0.063
  5    3    1    0.049
  5    5    0    0.039


Table 10.1. Table of Minkowski constants.
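
The tabulated values can be recomputed from the defining formula. The following is a small sketch under the assumption that the rows run over all pairs $(s, t)$ with $2 \le s + 2t \le 5$; the function names are chosen here for illustration only.

```python
import math

def minkowski_constant(s, t):
    """M_st = (4/pi)^t * n! / n^n with n = s + 2t."""
    n = s + 2 * t
    return (4 / math.pi) ** t * math.factorial(n) / n ** n

def round_up(x, places=3):
    """Round upwards in the given decimal place, to avoid under-estimates."""
    scale = 10 ** places
    return math.ceil(x * scale) / scale

for n in range(2, 6):
    for t in range(n // 2, -1, -1):
        s = n - 2 * t
        print(n, s, t, f"{round_up(minkowski_constant(s, t)):.3f}")
```

Running this prints the ten values 0.637, 0.500, 0.283, 0.223, 0.152, 0.120, 0.094, 0.063, 0.049, 0.039 in order.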
We can now give a criterion for a number field to have class-number 1,
for which the calculations required are often practicable.


Theorem 10.4. Let $\mathfrak{O}$ be the ring of integers of a number field $K$ of degree
$n = s + 2t$. Suppose that for every prime $p \in \mathbb{Z}$ with

$$p \le M_{st}\sqrt{|\Delta|}$$

($\Delta$ being the discriminant of $K$), every prime ideal dividing $(p)$ is principal.
Then $\mathfrak{O}$ has class-number $h = 1$.


Proof: Every class of fractional ideals contains an ideal $\mathfrak{a}$ with
$N(\mathfrak{a}) \le M_{st}\sqrt{|\Delta|}$. Now

$$N(\mathfrak{a}) = p_1 \cdots p_k$$

where $p_1, \ldots, p_k \in \mathbb{Z}$ and $p_i \le M_{st}\sqrt{|\Delta|}$; and $\mathfrak{a} \mid N(\mathfrak{a})$, so $\mathfrak{a}$ is a product
of prime ideals, each dividing some $p_i$. By hypothesis these prime ideals
are principal, so $\mathfrak{a}$ is principal. Therefore every class of fractional ideals is
equal to $[\mathfrak{O}]$, and $h = 1$. □

Specific numerical applications of this theorem, and related methods,
follow in the next section.

10.3 Some Class-Number Calculations 

Theorem 10.4 combines with Theorem 10.1 to provide a useful
computational technique for fields of small degree and with small discriminant. The
following examples show what is meant by ‘small’ in these circumstances.

1. $\mathbb{Q}(\sqrt{-19})$: The ring of integers is $\mathbb{Z}[\theta]$ where $\theta$ is a zero of
$$f(t) = t^2 - t + 5,$$
and the discriminant is $-19$. Then $M_{st}\sqrt{|\Delta|} < 0.637\sqrt{19}$ so
Theorem 10.4 applies if we know the factors of primes $\le 2$. Now we
use Theorem 10.1: modulo 2, $f(t)$ is irreducible, so $(2)$ is prime in $\mathfrak{O}$
(and hence every prime ideal dividing $(2)$ is equal to $(2)$ so is
principal); modulo 3, $f(t)$ is also irreducible, so $(3)$ is prime and the same
argument applies.


2. $\mathbb{Q}(\sqrt{-43})$: This is similar, but now
$$f(t) = t^2 - t + 11$$
and $M_{st}\sqrt{|\Delta|} < 0.637\sqrt{43}$ which involves looking at primes $\le 4$. But
$f(t)$ is irreducible modulo 2 or 3.


3. $\mathbb{Q}(\sqrt{-67})$: For this,
$$f(t) = t^2 - t + 17$$
and $M_{st}\sqrt{|\Delta|} < 0.637\sqrt{67}$ which involves looking at primes $\le 5$. But
$f(t)$ is irreducible modulo 2, 3 or 5.


4. $\mathbb{Q}(\sqrt{-163})$: Now
$$f(t) = t^2 - t + 41$$
and $M_{st}\sqrt{|\Delta|} < 0.637\sqrt{163}$ which involves looking at primes $\le 8$.
But $f(t)$ is irreducible modulo 2, 3, 5, or 7.
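
The four checks above follow one pattern, which can be sketched as follows. This is an illustration only: it uses the rounded constant 0.637 from Table 10.1 and relies on the fact that a quadratic is irreducible modulo $p$ exactly when it has no root modulo $p$; the names `CASES` and `primes_to_check` are made up for this example.

```python
import math

def is_prime(n):
    return n >= 2 and all(n % q for q in range(2, math.isqrt(n) + 1))

def irreducible_quadratic_mod_p(c, p):
    """True if t^2 - t + c has no root modulo p (so it is irreducible mod p)."""
    return all((t * t - t + c) % p != 0 for t in range(p))

# (|discriminant|, constant term c of f(t) = t^2 - t + c), as in examples 1-4
CASES = [(19, 5), (43, 11), (67, 17), (163, 41)]

def primes_to_check(abs_disc, m_st=0.637):
    """Primes not exceeding the bound m_st * sqrt(|discriminant|)."""
    bound = m_st * math.sqrt(abs_disc)
    return [p for p in range(2, math.floor(bound) + 1) if is_prime(p)]

for abs_disc, c in CASES:
    ps = primes_to_check(abs_disc)
    print(abs_disc, ps, all(irreducible_quadratic_mod_p(c, p) for p in ps))
```

In each case every prime below the bound stays prime in the ring of integers, so the hypothesis of Theorem 10.4 holds trivially.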


Combining these results with Theorem 4.17 (or using the above methods
for the other values of $\Delta$) we have:


Theorem 10.5. The class-number of $\mathbb{Q}(\sqrt{d})$ is equal to 1 for $d = -1, -2,
-3, -7, -11, -19, -43, -67, -163$. □


As we have remarked in Section 4.3, these are in fact the only values
of $d < 0$ for which $\mathbb{Q}(\sqrt{d})$ has unique factorization, or equivalently
class-number 1.

Comparing with Theorem 4.18 we obtain the interesting:

Corollary 10.6. There exist rings which have unique factorization but
are not Euclidean; for example the rings of integers of $\mathbb{Q}(\sqrt{d})$ for $d =
-19, -43, -67, -163$. □


We can also deal with a few cyclotomic fields by the same method. If
$K = \mathbb{Q}(\zeta)$ where $\zeta^p = 1$, $p$ prime, then the degree of $K$ is $p - 1$, and the
ring of integers is $\mathbb{Z}[\zeta]$. For $p = 3$, $K = \mathbb{Q}(\sqrt{-3})$ and we already know
$h = 1$ for this.


5. $\mathbb{Q}(\zeta)$ where $\zeta^5 = 1$: Here $n = 4$, $s = 0$, $t = 2$; and $\Delta = 125$
by Theorem 3.6. Hence $M_{st}\sqrt{|\Delta|} < 0.152\sqrt{125}$ so we must look at
primes $\le 1$. Since there are no such primes, Theorem 10.4 applies at
once to give $h = 1$.

6. $\mathbb{Q}(\zeta)$ where $\zeta^7 = 1$: Here $n = 6$, $s = 0$, $t = 3$, and $\Delta = -7^5$. We
have to look at primes $\le 3$. The ring of integers is $\mathbb{Z}[\zeta]$ where $\zeta$ is a
zero of

$$f(t) = t^6 + t^5 + t^4 + t^3 + t^2 + t + 1.$$

Modulo 2, this factorizes as

$$(t^3 + t^2 + 1)(t^3 + t + 1)$$

so $(2) = \mathfrak{p}_1\mathfrak{p}_2$ where $\mathfrak{p}_1, \mathfrak{p}_2$ are distinct prime ideals, by Theorem 10.1.
In fact

$$(\zeta^3 + \zeta^2 + 1)(\zeta^3 + \zeta + 1)\zeta^4 = 2,$$

so we have

$$(2) = (\zeta^3 + \zeta^2 + 1)(\zeta^3 + \zeta + 1)$$

and $\mathfrak{p}_1, \mathfrak{p}_2$ are principal.


Modulo 3, $f(t)$ is irreducible (by trying all possible divisors, or more
enlightened methods), so $(3)$ is prime.


Hence by Theorem 10.4, $h = 1$.
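
The identity expressing 2 as a product in $\mathbb{Z}[\zeta]$ can be confirmed by a short computation. The sketch below assumes elements are stored as coefficient lists of length 7 in powers of $\zeta$, with exponents reduced using $\zeta^7 = 1$ and the relation $1 + \zeta + \cdots + \zeta^6 = 0$; the helper names are illustrative.

```python
def mul_zeta(a, b, n=7):
    """Multiply two elements of Z[zeta] (zeta^n = 1) given as length-n lists."""
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % n] += ai * bj
    return out

def normalize(a):
    """Use 1 + zeta + ... + zeta^(n-1) = 0 to clear the top coefficient."""
    top = a[-1]
    return [c - top for c in a]

x = [1, 0, 1, 1, 0, 0, 0]   # zeta^3 + zeta^2 + 1
y = [1, 1, 0, 1, 0, 0, 0]   # zeta^3 + zeta + 1
z = [0, 0, 0, 0, 1, 0, 0]   # zeta^4

product = normalize(mul_zeta(mul_zeta(x, y), z))
print(product)
```

The normalized product is the list representing the rational integer 2, and since $\zeta^4$ is a unit this shows that the two cubic expressions generate ideals whose product is $(2)$.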


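Similar methods often allow us to compute $h$, even when it is not 1.


7. $\mathbb{Q}(\sqrt{10})$: The discriminant $\Delta = 40$, $n = 2$, $s = 2$, $t = 0$. Every class
of ideals contains one with norm

$$\le M_{20}\sqrt{|\Delta|} < 0.5\sqrt{40}$$

so we must factorize the primes $\le 3$. Now $\mathfrak{O} = \mathbb{Z}[\theta]$ where $\theta$ is a zero
of

$$f(t) = t^2 - 10.$$

$f(t) \equiv (t + 1)(t - 1)$ (mod 3), so $(3) = \mathfrak{q}_1\mathfrak{q}_2$ where $\mathfrak{q}_1 = (3, 1 + \sqrt{10})$,
$\mathfrak{q}_2 = (3, 1 - \sqrt{10})$. Modulo 2 we have $f(t) \equiv t \cdot t$ so that $(2) = \mathfrak{p}^2$
for a prime ideal $\mathfrak{p}$. If $\mathfrak{p}$ is principal, say $\mathfrak{p} = (a + b\sqrt{10})$, then the
equation

$$N(\mathfrak{p})^2 = N((2)) = 4$$

implies that $N(\mathfrak{p}) = 2$. Hence

$$a^2 - 10b^2 = \pm 2.$$

The latter, considered modulo 10, is impossible; hence $\mathfrak{p}$ is not
principal.


We have $\mathfrak{p}\mathfrak{q}_1 = (-2 + \sqrt{10})$ and $[\mathfrak{q}_1] = [\mathfrak{p}]^{-1}$; and since $\mathfrak{q}_1\mathfrak{q}_2 = (3)$,
also $[\mathfrak{q}_2] = [\mathfrak{q}_1]^{-1}$.


This means that every class of fractional ideals either contains a
principal ideal or $\mathfrak{p}$, hence equals $[\mathfrak{O}]$ or $[\mathfrak{p}]$. Since $\mathfrak{p}$ is not principal, these
two classes are distinct, so $h = 2$. The class-group is cyclic of order
2, and as verification we have

$$[\mathfrak{p}]^2 = [\mathfrak{p}^2] = [(2)] = [\mathfrak{O}].$$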
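
The congruence argument showing that $a^2 - 10b^2 = \pm 2$ has no integer solutions can be made explicit by listing residues. This is a minimal sketch; the variable names are chosen here for illustration.

```python
# All residues of a^2 - 10*b^2 modulo 10; only a^2 matters, since 10*b^2 vanishes.
values_mod_10 = sorted({(a * a - 10 * b * b) % 10
                        for a in range(10) for b in range(10)})

# Residues that +2 and -2 would require.
targets = {2 % 10, (-2) % 10}

print(values_mod_10, sorted(targets))
print(targets.isdisjoint(values_mod_10))
```

The attainable residues are exactly the squares modulo 10, and neither 2 nor 8 is among them, which is the impossibility used in the text.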


As we said in Section 4.4, all the imaginary quadratic fields $\mathbb{Q}(\sqrt{d})$ with
unique factorization are now known, verifying a conjecture of Gauss. But
Gauss also stated a more general conjecture, the Class Number Problem.
This states that for any given class number $h$, the set of $d < 0$ for which
$\mathbb{Q}(\sqrt{d})$ has class number $h$ is finite. It was proved in 1983 by Goldfeld, Gross,
and Zagier, and is described in a masterly survey by Goldfeld [30].


10.4 Tables 


To give an idea of how irregularly the class-number of $\mathbb{Q}(\sqrt{d})$ depends
upon $d$, we give a short table (Table 10.2) showing, for square-free $d$ with
$0 < d < 100$, the class-numbers $h$ of $\mathbb{Q}(\sqrt{d})$ and $h'$ of $\mathbb{Q}(\sqrt{-d})$.

Methods more suited to such computations than ours above exist,
especially analytic methods which are beyond our present scope. See Borevič
and Šafarevič [7] p. 342 ff.
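
For the imaginary fields, one classical alternative (not the method of this chapter, and named here plainly as a substitute technique) is to count reduced positive definite binary quadratic forms $ax^2 + bxy + cy^2$ of the field discriminant $D$; for $D < 0$ this count equals the class-number. The sketch below assumes that standard correspondence, and the function name is hypothetical.

```python
from math import gcd

def class_number_imaginary(d):
    """Class-number of Q(sqrt(-d)) for squarefree d > 0, by counting reduced forms."""
    D = -d if (-d) % 4 == 1 else -4 * d   # field discriminant
    h = 0
    a = 1
    while 3 * a * a <= -D:                # reduced forms satisfy 3a^2 <= |D|
        for b in range(-a + 1, a + 1):    # -a < b <= a
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a:
                continue                  # need a <= c
            if b < 0 and a == c:
                continue                  # when a = c only b >= 0 is reduced
            if gcd(gcd(a, b), c) != 1:
                continue                  # keep primitive forms only
            h += 1
        a += 1
    return h

for d in (1, 5, 14, 23, 47, 163):
    print(d, class_number_imaginary(d))
```

For these sample values the counts are 1, 2, 4, 3, 5 and 1, which can be compared with the $h'$ column of Table 10.2 where $d < 100$.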
  d    h   h'  |   d    h   h'  |   d    h   h'
  1    -    1  |  34    2    4  |  69    1    8
  2    1    1  |  35    2    2  |  70    2    4
  3    1    1  |  37    1    2  |  71    1    7
  5    1    2  |  38    1    6  |  73    1    4
  6    1    2  |  39    2    4  |  74    2   10
  7    1    1  |  41    1    8  |  77    1    8
 10    2    2  |  42    2    4  |  78    2    4
 11    1    1  |  43    1    1  |  79    3    5
 13    1    2  |  46    1    4  |  82    4    4
 14    1    4  |  47    1    5  |  83    1    3
 15    2    2  |  51    2    2  |  85    2    4
 17    1    4  |  53    1    6  |  86    1   10
 19    1    1  |  55    2    4  |  87    2    6
 21    1    4  |  57    1    4  |  89    1   12
 22    1    2  |  58    2    2  |  91    2    2
 23    1    3  |  59    1    3  |  93    1    4
 26    2    6  |  61    1    6  |  94    1    8
 29    1    6  |  62    1    8  |  95    2    8
 30    2    4  |  65    2    8  |  97    1    4
 31    1    3  |  66    2    8  |
 33    1    4  |  67    1    1  |


Table 10.2.


10.5 Exercises 


1. Let $K = \mathbb{Q}(\sqrt{3})$. Use Theorem 10.1 to factorize the following
principal ideals in the ring $\mathfrak{O}$ of integers of $K$:
$$(2),\ (3),\ (5),\ (10),\ (30).$$

2. Factorize the following principal ideals in the ring of integers of $\mathbb{Q}(\sqrt{5})$:
$$(2),\ (3),\ (5),\ (12),\ (25).$$

3. Factorize the following ideals in $\mathbb{Z}[\zeta]$ where $\zeta = e^{2\pi i/5}$:
$$(2),\ (5),\ (20),\ (50).$$

4. Compute the volume integral quoted in the proof of Theorem 10.2.
5. If $K$ is a number field of degree $n$, prove that

$$|\Delta| \ge \left(\frac{\pi}{4}\right)^n \left(\frac{n^n}{n!}\right)^2$$

where $\Delta$ is the discriminant.


6. Prove that there can exist only finitely many number fields with any
given discriminant.


7. Using the methods of this chapter, compute the class-numbers of
fields $\mathbb{Q}(\sqrt{d})$ for $-20 < d < 20$.
Kummer’s Special Case of
Fermat’s Last Theorem


We now have sufficient machinery at our disposal to tackle Fermat’s Last
Theorem in a special case, namely when the exponent $n$ in the equation
$x^n + y^n = z^n$ is a so-called ‘regular prime’, and when $n$ does not divide
any of $x$, $y$, or $z$. We begin with a short historical survey to set this version
of the problem in perspective. Following this we show how elementary
methods dispose of the case $n = 4$, and reduce the problem to odd prime
values of $n$. In this chapter we will not deal with the case where one of
$x$, $y$, or $z$ is divisible by $n$, neither will we deal with the case of irregular
primes, since these cases will all be included in the full version of Fermat’s
Last Theorem given in Chapter 14. In a final discursive section we discuss
the regularity property and some related matters.


11.1 Some History 


The origins of Fermat’s Last Theorem have been explained in the introduc- 
tion to this book. Useful references for background reading are Stewart [70] 
and Ribenboim [58]. Fermat himself is considered to have disposed of the 
cases n = 3,4, because he issued these specific cases as mathematical chal- 
lenges to others. In fact he produced only one written proof in the whole 
of his mathematical career. This states that the area of a right-angled 
triangle with rational sides cannot be a perfect square. Algebraically, this
statement translates to the assertion that there are no (non-zero) integer
solutions $x, y, z$ of the equation $x^2 + y^2 = z^2$ where $xy/2$ is a square. From
this it is easy to deduce Fermat’s Last Theorem for $n = 4$.

Euler (1707-1783) produced his own proof for the case $n = 3$ in his
Algebra of 1770. However, his proof contained a subtle error. He needed
to find cubes of the form $p^2 + 3q^2$, and ingeniously showed that, for any
integers $a$ and $b$, if we define

$$p = a^3 - 9ab^2, \qquad q = 3(a^2b - b^3)$$

then

$$p^2 + 3q^2 = (a^2 + 3b^2)^3.$$
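
One way to see where these formulas come from, offered here as a sketch rather than as Euler's own presentation, is to cube $a + b\sqrt{-3}$ and use the multiplicativity of the norm $N(u + v\sqrt{-3}) = u^2 + 3v^2$:

```latex
\begin{align*}
(a + b\sqrt{-3})^3 &= a^3 + 3a^2 b\sqrt{-3} + 3ab^2(-3) + b^3(-3)\sqrt{-3} \\
                   &= (a^3 - 9ab^2) + 3(a^2 b - b^3)\sqrt{-3} = p + q\sqrt{-3}, \\
p^2 + 3q^2 &= N(p + q\sqrt{-3}) = N(a + b\sqrt{-3})^3 = (a^2 + 3b^2)^3 .
\end{align*}
```

For instance $a = b = 1$ gives $p = -8$, $q = 0$ and $p^2 + 3q^2 = 64 = 4^3$.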

However, he then tried to show the reverse process also works, namely,
if $p^2 + 3q^2$ is a perfect cube, then there exist integers $a, b$ satisfying the
above relationships. Here he worked with algebraic numbers of the form
$x + y\sqrt{-3}$, with $x, y$ integers, believing that these numbers possessed the
same properties as ordinary integers, including uniqueness of factorization.
(As it happens, factorization is unique in this case, but Euler did not
realise that this needed proving.) This omission went unnoticed at the
time. However, other results that Euler published gave an alternative proof
for $n = 3$, without logical gaps, thus justifying giving him full credit for
this case.

Sophie Germain (1776-1831) was one of the very few women doing
research in mathematics at this time. As a woman, she was unable to attend
the École Polytechnique when it opened in Paris in 1794. Instead she
assumed the identity of a student, ‘Monsieur Antoine-Auguste Le Blanc’, who
had left the course without giving formal notice. So elegant and insightful
were her written solutions of weekly problems that her ability was noted
by Lagrange. He insisted on a meeting, which revealed her subterfuge. He
gave her positive encouragement, and she developed a serious interest in
Fermat’s Last Theorem. Her contribution was in two parts. First, she
focused on the case where $n$ and $2n + 1$ are both primes; subsequently
such an $n$ was called a Sophie Germain prime in her honour. The first few
Sophie Germain primes are 2, 3, 5, 11, 23, 29, 41, 53, 83, 89, 113, 131. For
such a prime she showed that if $x^n + y^n = z^n$ has a solution, then one of
$x, y, z$ must be a multiple of $n$. To do so, she subdivided her search for a
solution into two cases:


1. None of $x, y, z$ is divisible by $n$
2. Only one of $x, y, z$ is divisible by $n$


(If two of $x, y, z$ are divisible by $n$ then all three are, and if all three are,
then the common factor $n^n$ can be divided out, so only these cases are
needed for a proof.)
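
The defining condition for a Sophie Germain prime, that $n$ and $2n + 1$ are both prime, is easy to test directly. The following is a small illustrative sketch using trial division; the function names are chosen here.

```python
import math

def is_prime(n):
    """Trial division; adequate for the small values used here."""
    return n >= 2 and all(n % q for q in range(2, math.isqrt(n) + 1))

def sophie_germain_primes(limit):
    """All primes p <= limit such that 2p + 1 is also prime."""
    return [p for p in range(2, limit + 1) if is_prime(p) and is_prime(2 * p + 1)]

print(sophie_germain_primes(131))
```

Note that 43 does not qualify, because $2 \cdot 43 + 1 = 87 = 3 \cdot 29$, whereas 53 does, because 107 is prime.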
Around 1825, she proved case (1) of Fermat’s Last Theorem for such
primes, and Legendre generalized this to odd primes p such that kp + 1 
is prime, where k = 4, 8, 10, 14 and 16. Attention then turned to case 
(2). A partial proof of this case for n = 5 was presented to the Paris 
Academy by Dirichlet in July 1825. Legendre filled in the other details 
in September 1825, hence completing the full proof for n = 5. Dirichlet 
continued to work on the case n = 7, only to realise that the closely related 
case n = 14 was more amenable to his methods. He published the proof 
for n = 14 in 1832. The case n = 7 was finally proved in 1839 by Gabriel 
Lamé (1796-1870). It required far more subtle computations than those 
of earlier cases and gave the impression that further progress was unlikely 
unless a completely different line of attack was found. The next major step 
forward was followed by an immediate retreat. 

On 1 March 1847, Lamé addressed the Paris Academy and announced
a complete proof of Fermat’s Last Theorem. He outlined a proof which
introduced the complex $n$th roots of unity and factorized the equation
$x^n + y^n = z^n$ into linear terms

$$x^n + y^n = (x + y)(x + \zeta y) \cdots (x + \zeta^{n-1} y)$$

where $\zeta = e^{2\pi i/n}$ and $n$ is odd. To prove this, consider $x^n - y^n$ as a
polynomial in $x$ with coefficients in $\mathbb{C}[y]$. It is zero when $x^n = y^n$, that
is, $x = \zeta^k y$ for $0 \le k \le n - 1$. Therefore, by the Remainder Theorem,
$x^n - y^n$ is divisible by $x - \zeta^k y$ for $0 \le k \le n - 1$. It is therefore divisible by
$(x - y)(x - \zeta y) \cdots (x - \zeta^{n-1} y)$, but this has degree $n$ and leading coefficient 1,
just like $x^n - y^n$, so the two are equal. Now replace $y$ by $-y$ and remember
that $n$ is odd.
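
The factorization into linear terms can be sanity-checked numerically. The sketch below uses floating-point complex arithmetic, so it only confirms the identity up to rounding error; the function name `linear_product` is made up for this illustration.

```python
import cmath

def linear_product(x, y, n):
    """Product of (x + zeta^k * y) for k = 0..n-1, with zeta = exp(2*pi*i/n)."""
    zeta = cmath.exp(2j * cmath.pi / n)
    result = 1
    for k in range(n):
        result *= x + zeta ** k * y
    return result

for n in (3, 5, 7):
    print(n, linear_product(2, 3, n), 2 ** n + 3 ** n)

# For even n the same product gives x^n - y^n instead, which is why n must be odd.
print(4, linear_product(2, 3, 4), 2 ** 4 - 3 ** 4)
```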

Lamé acknowledged that he was indebted to Liouville for this idea.
Then Liouville took the stage and acknowledged his contribution, but
pointed out that the argument used depended on uniqueness of
factorization. He went on to explain that he suspected that this property might
fail. Immediately the focus was turned on uniqueness of factorization. A
fortnight later Wantzel, a member of the Academy, had his short time of
fame. On 15 March he announced a proof of uniqueness of factorization for
some cases, by providing arguments for $n = 2, 3, 4$. He also stated that his
method of proof failed for $n = 23$ (see Cauchy [12] p. 308). On 24 May
Liouville informed the Academy that Kummer had already shown the failure
of unique factorization three years before, but had developed a technical
alternative that worked by introducing what he called ‘ideal numbers’. In
1850 Kummer produced his sensational proof of Fermat’s Last Theorem
for what he termed ‘regular’ primes, including all primes less than 100
except for 37, 59, 67. Kummer asserted that there is an infinite number of
regular primes, but this has never been proved (although Jensen proved
that there is an infinite number of irregular primes in 1915). The same
year, Kummer attended to the three cases 37, 59, 67, but made errors that
went unnoticed until Vandiver found them in 1920. A proof for $n = 37$
was given by Mirimanoff in 1893, and he extended this as far as $n \le 257$ in
1905. Vandiver laid down methods that made a computational approach
possible, which led to proofs for $n < 25{,}000$ by Selfridge and Pollack [66],
then $n < 125{,}000$ by Wagstaff [78]. By 1993 the record was $n < 4{,}000{,}000$,
see Buhler et al. [10].


The proof by Kummer therefore occupies a pivotal position in the devel- 
opment of Fermat’s Last Theorem. It changed the focus from increasingly 
complicated ways of dealing with small values of n, using a variety of meth- 
ods, before 1850, towards a more general proof for a wide variety of values 
of n in the late 19th and 20th centuries. Before moving on to entirely new 
techniques, it is therefore worth taking a detailed look at Kummer’s proof 
and the methods behind it. 


11.2 Elementary Considerations 
We consider what can be said about the Fermat equation 
$$x^n + y^n = z^n \tag{11.1}$$


from an elementary point of view. If a solution to Equation (11.1) exists
then there must exist one solution in which $x$, $y$, $z$ are coprime in pairs.
For if a prime $q$ divides $x$ and $y$, then $x = qx'$, $y = qy'$,


$$q^n(x'^n + y'^n) = z^n$$


so that $q$ also divides $z$, say $z = qz'$, and then $x'^n + y'^n = z'^n$. Similarly
if $q$ divides $x$ and $z$, or $y$ and $z$. In this way we can remove all common
factors from $x$, $y$, $z$.


Next note that if Equation (11.1) is impossible for an exponent $n$ then
it is impossible for all multiples of $n$. For if $x^{mn} + y^{mn} = z^{mn}$ then
$(x^m)^n + (y^m)^n = (z^m)^n$. Now any integer $\geq 3$ is divisible either by 4 or by
an odd prime. Hence to prove (or disprove) the conjecture it is sufficient
to consider the cases $n = 4$ and $n$ an odd prime.
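
The reduction just described is easy to mechanize. The following is a minimal sketch, not part of the original argument: the function name `reduced_exponent` is our own, and it simply returns an exponent (4 or an odd prime) dividing a given n, so that a proof for the returned exponent covers n as well.

```python
def reduced_exponent(n: int) -> int:
    """Return 4 or an odd prime that divides n (hypothetical helper, n >= 3)."""
    if n < 3:
        raise ValueError("n must be at least 3")
    # Strip all factors of 2 to obtain the odd part of n.
    odd_part = n
    while odd_part % 2 == 0:
        odd_part //= 2
    if odd_part == 1:
        # n is a power of 2 and n >= 4, so 4 divides n.
        return 4
    # The smallest divisor > 1 of the odd part is an odd prime.
    d = 3
    while d * d <= odd_part:
        if odd_part % d == 0:
            return d
        d += 2
    return odd_part
```

For example, the exponent 12 reduces to 3, and the exponent 16 reduces to 4.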


We start with Fermat’s proof for $n = 4$. It is based on the (well known)
general solution of the Pythagorean equation $x^2 + y^2 = z^2$, given by:

Lemma 11.1. The solutions of $x^2 + y^2 = z^2$ with pairwise coprime integers
$x$, $y$, $z$ are given parametrically by


$$\pm x = r^2 - s^2$$
$$\pm y = 2rs$$
$$\pm z = r^2 + s^2$$


(or with $x$, $y$ interchanged) where $r$, $s$ are coprime and exactly one is odd.
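
As an illustration of the parametrization, here is a short sketch that lists positive primitive triples from the pairs (r, s). The function name `primitive_triples` and the bound `limit` are our own choices, not notation from the text.

```python
from math import gcd

def primitive_triples(limit: int):
    """Yield (x, y, z) = (r^2 - s^2, 2rs, r^2 + s^2) with z <= limit."""
    r = 2
    while r * r + 1 <= limit:
        for s in range(1, r):
            # Lemma 11.1 requires r, s coprime with exactly one of them odd.
            if gcd(r, s) == 1 and (r - s) % 2 == 1:
                z = r * r + s * s
                if z <= limit:
                    yield (r * r - s * s, 2 * r * s, z)
        r += 1

# Sort by hypotenuse for a readable listing.
triples = sorted(primitive_triples(30), key=lambda t: t[2])
```

With `limit = 30` this produces the five familiar triples with hypotenuse 5, 13, 17, 25 and 29.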


Proof: We shall give the classical proof. It is sufficient to consider $x$, $y$,
$z$ to be positive. They cannot all be odd, for this gives the contradiction
‘odd + odd = odd’. Since they are pairwise coprime, this means precisely
one is even. It cannot be $z$, for then $z = 2k$, $x = 2a + 1$, $y = 2b + 1$ where
$k$, $a$, $b$ are rational integers, and


$$(2a + 1)^2 + (2b + 1)^2 = 4k^2.$$


This cannot occur since the left-hand side is clearly not divisible by 4 whilst
the right-hand side is. So one of $x$, $y$ is even. We can suppose that this is
$x$. Then

$$x^2 = z^2 - y^2 = (z + y)(z - y).$$

Because $x$, $z + y$, $z - y$ are all even and positive, we can write $x = 2u$,
$z + y = 2v$, $z - y = 2w$, whence


$$(2u)^2 = 2v \cdot 2w,$$


or

$$u^2 = vw. \tag{11.2}$$


Now $v$, $w$ are coprime, for a common factor of $v$, $w$ would divide their sum
$v + w = z$ and their difference $v - w = y$, which have no proper common
factors. Factorizing $u$, $v$, $w$ into prime factors, we see that (11.2) implies $v$,
$w$ are both squares, say $v = r^2$, $w = s^2$. Moreover $r$, $s$ are coprime because
$v$, $w$ are.


Thus

$$z = v + w = r^2 + s^2,$$

$$y = v - w = r^2 - s^2.$$


Because $y$, $z$ are both odd, precisely one of $r$, $s$ is odd. Finally

$$x^2 = z^2 - y^2 = (r^2 + s^2)^2 - (r^2 - s^2)^2 = 4r^2s^2,$$


so


$$x = 2rs. \qquad \Box$$
Now we can prove a theorem even stronger than the impossibility of
Equation (11.1) for $n = 4$, namely:


Theorem 11.2. The equation $x^4 + y^4 = z^2$ has no integer solutions with
$x, y, z \neq 0$.


Proof: First note that this is stronger, since if $x^4 + y^4 = z^4$ then $x$, $y$, $z^2$
satisfy the above equation.
Suppose a solution of


$$x^4 + y^4 = z^2 \tag{11.3}$$


exists. We may assume $x$, $y$, $z$ are positive. Among such solutions there
exists one for which $z$ is smallest: assume we have this one in (11.3). Then
$x$, $y$, $z$ are coprime (or we can cancel a common factor and make $z$ smaller)
and so by Lemma 11.1 we have


$$x^2 = r^2 - s^2, \quad y^2 = 2rs, \quad z = r^2 + s^2,$$

where $x$, $z$ are odd and $y$ is even. The first of these implies

$$x^2 + s^2 = r^2$$

with $x$, $s$ coprime. Hence by Lemma 11.1 again, since $x$ is odd

$$x = a^2 - b^2, \quad s = 2ab, \quad r = a^2 + b^2.$$

But now we substitute back to get

$$y^2 = 2rs = 2 \cdot 2ab(a^2 + b^2)$$

so $y$ is even, say $y = 2k$, and

$$k^2 = ab(a^2 + b^2).$$

Since $a$, $b$ and $a^2 + b^2$ are pairwise coprime we have

$$a = c^2, \quad b = d^2, \quad a^2 + b^2 = e^2,$$

so that

$$c^4 + d^4 = e^2.$$


This is an equation of type (11.3), but $e < a^2 + b^2 = r < z$, so we have
contradicted the minimality of $z$. $\Box$
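
A computer search cannot replace the descent argument, but it gives a quick sanity check of Theorem 11.2 over a small range. The sketch below is our own illustration: the function name `quartic_square_solutions` and the search bound are arbitrary choices.

```python
from math import isqrt

def quartic_square_solutions(bound: int):
    """Return all (x, y, z) with 1 <= x <= y <= bound and x^4 + y^4 = z^2."""
    solutions = []
    for x in range(1, bound + 1):
        for y in range(x, bound + 1):
            n = x ** 4 + y ** 4
            z = isqrt(n)  # exact integer square root, no rounding error
            if z * z == n:
                solutions.append((x, y, z))
    return solutions

# Theorem 11.2 predicts that this list is empty for every bound.
found = quartic_square_solutions(200)
```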
11.3 Kummer’s Lemma


This section begins our build-up to the solution of a special case of Fermat’s
Last Theorem, with a detailed study of the field $K = \mathbb{Q}(\zeta)$ where $\zeta = e^{2\pi i/p}$
for an odd prime $p$. As in Chapter 3 we write


$$\lambda = 1 - \zeta.$$

Further we define

$$\mathfrak{l} = (\lambda),$$


the ideal generated by $\lambda$ in the ring of integers $\mathbb{Z}[\zeta]$ of $K$. We start with
some properties of $\mathfrak{l}$.


Lemma 11.3. (a) $\mathfrak{l}^{p-1} = (p)$, (b) $N(\mathfrak{l}) = p$.


Proof: First note that for $j = 1, \ldots, p-1$ the numbers $1 - \zeta$ and $1 - \zeta^j$
are associates in $\mathbb{Z}[\zeta]$. Clearly $1 - \zeta \mid 1 - \zeta^j$. But if we choose $t$ such that
$jt \equiv 1 \pmod{p}$ then $1 - \zeta = 1 - \zeta^{jt}$ so that $1 - \zeta^j \mid 1 - \zeta$. Hence they are
associates.

Now Equation (3.9) of Chapter 3 leads to


$$(p) = \prod_{j=1}^{p-1} (1 - \zeta^j)$$


but the above remarks show that $(1 - \zeta^j) = (1 - \zeta) = \mathfrak{l}$, so that

$$(p) = \mathfrak{l}^{p-1}$$

and (a) is proved. Part (b) is immediate on taking norms. $\Box$
A useful consequence of (b) should be noted. It implies that $|\mathbb{Z}[\zeta]/\mathfrak{l}| = p$,
from which it follows (on looking at the natural homomorphism
$\mathbb{Z}[\zeta] \to \mathbb{Z}[\zeta]/\mathfrak{l}$) that every element of $\mathbb{Z}[\zeta]$ is congruent modulo $\mathfrak{l}$ to one
of $0, 1, 2, \ldots, p-1$.
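
The norm computation behind Lemma 11.3 can be checked numerically: the product of the conjugates of 1 − ζ, taken over the exponents j = 1, ..., p − 1, should equal p. The following floating-point sketch is only an illustration; the function name `lambda_norm` is ours, and the result is exact only up to rounding error.

```python
import cmath

def lambda_norm(p: int) -> complex:
    """Numerically evaluate the product of (1 - zeta^j) for j = 1..p-1."""
    zeta = cmath.exp(2j * cmath.pi / p)  # primitive p-th root of unity
    product = 1 + 0j
    for j in range(1, p):
        product *= 1 - zeta ** j
    return product

# Each value should be very close to the prime p itself.
norms = {p: lambda_norm(p) for p in (3, 5, 7, 11, 13)}
```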
The main aim of the rest of this section is to give a useful, though
incomplete, description of the units of $\mathbb{Z}[\zeta]$. We start by finding which
roots of unity occur, showing that there are no ‘accidental’ occurrences:


Lemma 11.4. The only roots of unity in $K$ are $\pm\zeta^s$ for integers $s$.


Proof: First we show $i \notin K$. If $i \in K$ then since $2 = i(1 - i)^2$ we have

$$(2) = (1 - i)^2.$$

Hence when $(2)$ is resolved into prime factors in $\mathbb{Z}[\zeta]$ it has repeated factors.
By Theorem 10.1 this implies that the polynomial


$$f(t) = \frac{t^p - 1}{t - 1}$$


has a repeated irreducible factor modulo 2, hence that $t^p - 1$ has a repeated
irreducible factor modulo 2. Then the remark following Theorem 1.5 tells
us that $t^p - 1$ and $D(t^p - 1) = pt^{p-1}$ are not coprime. However, $p$ is odd, so
these polynomials modulo 2 take the form $t^p + 1$, $t^{p-1}$ which are obviously
coprime. This is a contradiction.

In exactly the same way we can show that for an odd prime $q \neq p$,


$$e^{2\pi i/q} \notin K.$$


We just use


$$(q) = (1 - e^{2\pi i/q})^{q-1}.$$

Next we remark that

$$e^{2\pi i/p^2} \notin K.$$


For $e^{2\pi i/p^2}$ satisfies $t^{p^2} - 1 = 0$, but not $t^p - 1 = 0$ and so is a root of


$$f(t) = (t^{p^2} - 1)/(t^p - 1) = \sum_{r=0}^{p-1} t^{rp} = 0.$$


By applying Eisenstein’s criterion to $f(t + 1)$, a little arithmetic shows
$f(t + 1)$, and hence $f(t)$, is irreducible. Thus $f$ is the minimum polynomial
of $e^{2\pi i/p^2}$. Since $[K : \mathbb{Q}] = p - 1$ it follows from Theorems 1.10 and 1.11
that $e^{2\pi i/p^2} \notin K$.

Suppose now that $e^{2\pi i/m} \in K$ for an integer $m$. Then the above results
show that


$$4 \nmid m, \quad q \nmid m, \quad p^2 \nmid m.$$

Hence $m \mid 2p$ which leads at once to the desired result. $\Box$


Lemma 11.5. For each $\alpha \in \mathbb{Z}[\zeta]$ there exists $a \in \mathbb{Z}$ such that
$\alpha^p \equiv a \pmod{\mathfrak{l}^p}$.


Proof: We have already remarked on the existence of $b \in \mathbb{Z}$ such that
$\alpha \equiv b \pmod{\mathfrak{l}}$. Now


$$\alpha^p - b^p = \prod_{j=0}^{p-1} (\alpha - \zeta^j b)$$

and since $\zeta \equiv 1 \pmod{\mathfrak{l}}$ each factor on the right is congruent to $\alpha - b \equiv 0$
$\pmod{\mathfrak{l}}$. Multiplying up, $\alpha^p - b^p \equiv 0 \pmod{\mathfrak{l}^p}$. $\Box$

Next comes a curious result about polynomials and roots of unity:


Lemma 11.6. If $p(t) \in \mathbb{Z}[t]$ is a monic polynomial, all of whose zeros in
$\mathbb{C}$ have absolute value 1, then every zero is a root of unity.


Proof: Let $\alpha_1, \ldots, \alpha_k$ be the zeros of $p(t)$. For each integer $l > 0$ the
polynomial


$$p_l(t) = (t - \alpha_1^l) \cdots (t - \alpha_k^l)$$

lies in $\mathbb{Z}[t]$ by the usual argument on symmetric polynomials. Now if

$$p_l(t) = t^k + a_{k-1}t^{k-1} + \cdots + a_0$$


then

$$|a_j| \leq \binom{k}{j} \qquad (j = 0, \ldots, k-1)$$


by estimating the size of elementary symmetric polynomials in the $\alpha_j^l$ and
using $|\alpha_j| = 1$. But only finitely many distinct polynomials over $\mathbb{Z}$ can
satisfy this system of inequalities, so for some $m \neq l$ we must have


$$p_l(t) = p_m(t).$$

Hence there exists a permutation $\pi$ of $\{1, \ldots, k\}$ such that

$$\alpha_j^l = \alpha_{\pi(j)}^m$$

for $j = 1, \ldots, k$. Inductively we find that


$$\alpha_j^{l^r} = \alpha_{\pi^r(j)}^{m^r}$$


and so, since $\pi^{k!}(j) = j$, we have $\alpha_j^{l^{k!}} = \alpha_j^{m^{k!}}$ and hence

$$\alpha_j^{(l^{k!} - m^{k!})} = 1.$$

Since $l^{k!} \neq m^{k!}$ it follows that $\alpha_j$ is a root of unity. $\Box$


Now we may prove the main result of this section, known as Kummer’s 
lemma: 


Lemma 11.7. Every unit of $\mathbb{Z}[\zeta]$ is of the form $r\zeta^g$ where $r$ is real and $g$
is an integer.
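Proof: Let $\varepsilon$ be a unit in $\mathbb{Z}[\zeta]$. There exists a polynomial $e(t) \in \mathbb{Z}[t]$ such
that $\varepsilon = e(\zeta)$. For $s = 1, \ldots, p-1$ we have


$$\varepsilon_s = e(\zeta^s)$$


conjugate to $\varepsilon$. Now $1 = \pm N(\varepsilon) = \pm\varepsilon_1 \cdots \varepsilon_{p-1}$, so each $\varepsilon_s$ is also a unit.
Further, if bars denote complex conjugation, we have


$$\varepsilon_{p-s} = e(\zeta^{p-s}) = e(\zeta^{-s}) = e(\overline{\zeta^s}) = \overline{e(\zeta^s)} = \overline{\varepsilon_s}.$$


Therefore

$$\varepsilon_s\varepsilon_{p-s} = |\varepsilon_s|^2 > 0.$$

Then

$$\pm 1 = N(\varepsilon) = (\varepsilon_1\varepsilon_{p-1})(\varepsilon_2\varepsilon_{p-2}) \cdots > 0$$


so that $N(\varepsilon) = 1$.
Now each $\varepsilon_s/\varepsilon_{p-s}$ is a unit, of absolute value 1, and by a symmetric
polynomial argument


$$\prod_{s=1}^{p-1} (t - \varepsilon_s/\varepsilon_{p-s})$$


has coefficients in $\mathbb{Z}$. By Lemma 11.6 its zeros are roots of unity. An appeal
to Lemma 11.4 yields the equation


$$\varepsilon/\varepsilon_{p-1} = \pm\zeta^u$$

for integer $u$. Since $p$ is odd either $u$ or $u + p$ is even, so we have

$$\varepsilon/\varepsilon_{p-1} = \pm\zeta^{2g} \tag{11.4}$$


for $0 < g \in \mathbb{Z}$.

The crucial step now is to find out whether the sign in (11.4) is positive
or negative. To do this we work out the left-hand side modulo $\mathfrak{l}$, as follows:
we know that for some $v \in \mathbb{Z}$


$$\zeta^{-g}\varepsilon \equiv v \pmod{\mathfrak{l}}$$

whence by taking complex conjugates

$$\zeta^{g}\varepsilon_{p-1} \equiv v \pmod{(\bar{\lambda})}.$$


But $\bar{\lambda} = 1 - \zeta^{p-1}$ is an associate of $\lambda$, so in fact $(\bar{\lambda}) = \mathfrak{l}$. Eliminating $v$
leads to


$$\varepsilon/\varepsilon_{p-1} \equiv \zeta^{2g} \pmod{\mathfrak{l}}.$$

With a negative sign in Equation (11.4) we are led to

$$\mathfrak{l} \mid 2\zeta^{2g}$$

and hence, taking norms,

$$N(\mathfrak{l}) \mid 2^{p-1}$$

which contradicts Lemma 11.3(b). So the sign in (11.4) is positive. Hence

$$\zeta^{-g}\varepsilon = \zeta^{g}\varepsilon_{p-1}.$$


The two sides of this equation are complex conjugates, so are in fact real;
hence $\zeta^{-g}\varepsilon = r \in \mathbb{R}$ which proves the lemma. $\Box$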
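
Kummer’s lemma can be observed numerically for a specific unit. The sketch below, which is our own illustration and not part of the proof, takes the unit 1 + ζ and searches for the exponent g (modulo p) that makes ζ^{-g}(1 + ζ) real. The function name `real_rotation_exponent` and the tolerance are assumptions chosen for the example; floating-point arithmetic is used, so the test for "real" is only approximate.

```python
import cmath
import math

def real_rotation_exponent(p: int, tol: float = 1e-9):
    """Find g in 0..p-1 with zeta^(-g) * (1 + zeta) real; return (g, r)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    unit = 1 + zeta
    for g in range(p):
        r = zeta ** (-g) * unit
        if abs(r.imag) < tol:
            return g, r.real
    return None

g5, r5 = real_rotation_exponent(5)
g7, r7 = real_rotation_exponent(7)
```

In these small cases the exponent found is g = (p + 1)/2, and the real factor r turns out to be negative for p = 5.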


11.4 Kummer’s Theorem


In order to state Kummer’s special case of Fermat’s Last Theorem, we need 
a technical definition. A prime $p$ is said to be regular if it does not divide
the class-number of $\mathbb{Q}(\zeta)$, where $\zeta = e^{2\pi i/p}$. By Section 10.3, $p = 3, 5, 7$ are
regular. Further discussion of the regularity property is postponed until 
Section 11.5, for we are now in a position to state and prove: 


Theorem 11.8. If $p$ is an odd regular prime then the equation

$$x^p + y^p = z^p$$

has no solutions in integers $x$, $y$, $z$ satisfying


$$p \nmid x, \quad p \nmid y, \quad p \nmid z.$$


Proof: We consider instead the equation

$$x^p + y^p + z^p = 0 \tag{11.5}$$


which exhibits greater symmetry. Since we can pass from this to the Fermat
equation by changing $z$ to $-z$, it suffices to work on Equation (11.5). We
assume, for a contradiction, that there exists a solution $(x, y, z)$ of (11.5)
in integers prime to $p$. We may as usual assume further that $x$, $y$, $z$ are
pairwise coprime. We may factorize (11.5) in $\mathbb{Q}(\zeta)$ to obtain


$$\prod_{j=0}^{p-1} (x + \zeta^j y) = -z^p$$


and pass to ideals:

$$\prod_{j=0}^{p-1} (x + \zeta^j y) = (z)^p. \tag{11.6}$$


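First we establish that all factors on the left of this equation are pairwise
coprime. For suppose $\mathfrak{p}$ is a prime ideal dividing $(x + \zeta^k y)$ and $(x + \zeta^l y)$
with $0 \leq k < l \leq p - 1$. Then $\mathfrak{p}$ contains


$$(x + \zeta^k y) - (x + \zeta^l y) = y\zeta^k(1 - \zeta^{l-k}).$$


Now $1 - \zeta^{l-k}$ is an associate of $1 - \zeta = \lambda$, and $\zeta^k$ is a unit, so $\mathfrak{p}$ contains
$y\lambda$. Since $\mathfrak{p}$ is prime either $\mathfrak{p} \mid y$ or $\mathfrak{p} \mid \lambda$. In the first case $\mathfrak{p}$ also divides $z$ from
Equation (11.6). Now $y$ and $z$ are coprime integers, so there exist $a, b \in \mathbb{Z}$
such that $az + by = 1$. But $y, z \in \mathfrak{p}$ so $1 \in \mathfrak{p}$, a contradiction. On the other
hand, since $N(\mathfrak{l}) = p$ it follows from Theorem 5.14(a) that $\mathfrak{l}$ is prime; so if
$\mathfrak{p} \mid \lambda$ then $\mathfrak{p} = \mathfrak{l}$. Then $\mathfrak{l} \mid z$ so


$$p = N(\mathfrak{l}) \mid N(z) = z^{p-1}$$


and $p \mid z$ contrary to hypothesis.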

Uniqueness of prime factorization of ideals now implies that each factor
on the left of Equation (11.6) is a $p$th power of an ideal, since the right-
hand side is a $p$th power and the factors are pairwise coprime. In particular
there is an ideal $\mathfrak{a}$ such that


$$(x + \zeta y) = \mathfrak{a}^p.$$


Thus $\mathfrak{a}^p$ is principal. Regularity of $p$ means that $p \nmid h$, the class-number of
$\mathbb{Q}(\zeta)$, and then Proposition 9.8(b) tells us that $\mathfrak{a}$ is principal, say $\mathfrak{a} = (\delta)$.
It follows that


$$x + \zeta y = \varepsilon\delta^p$$


where $\varepsilon$ is a unit.
Now we use Lemma 11.7 to conclude that


$$x + \zeta y = r\zeta^g\delta^p$$

where $r$ is real. By Lemma 11.5 there exists $a \in \mathbb{Z}$ such that

$$\delta^p \equiv a \pmod{\mathfrak{l}^p}.$$

Hence


$$x + \zeta y \equiv ra\zeta^g \pmod{\mathfrak{l}^p}.$$

Lemma 11.3(a) shows that $(p) \mid \mathfrak{l}^p$, so

$$x + \zeta y \equiv ra\zeta^g \pmod{(p)}.$$

Now $\zeta^{-g}$ is a unit, so

$$\zeta^{-g}(x + \zeta y) \equiv ra \pmod{(p)}.$$

Taking complex conjugates leads to

$$\zeta^{g}(x + \zeta^{-1}y) \equiv ra \pmod{(p)}$$

and so, eliminating $ra$, we obtain the important congruence

$$x\zeta^{-g} + y\zeta^{1-g} - x\zeta^{g} - y\zeta^{g-1} \equiv 0 \pmod{(p)}. \tag{11.7}$$


Observe that $1 + \zeta$ is a unit (put $t = -1$ in Equation (3.3)). We
investigate possible values for $g$ in Equation (11.7).

Suppose $g \equiv 0 \pmod{p}$. Then $\zeta^g = 1$, the terms with $x$ cancel, and
(11.7) becomes


$$y(\zeta - \zeta^{-1}) \equiv 0 \pmod{(p)}$$


and so


$$y(1 + \zeta)(1 - \zeta) \equiv 0 \pmod{(p)}.$$


Since $1 + \zeta$ is a unit,

$$y\lambda \equiv 0 \pmod{(p)}.$$


Now $(p) = (\lambda)^{p-1}$ and $p - 1 \geq 2$, so we have $\lambda \mid y$. Taking norms, $p \mid y$,
contrary to hypothesis. Hence $g \not\equiv 0 \pmod{p}$. A similar argument shows
that $g \not\equiv 1 \pmod{p}$.

We rewrite (11.7) in the form


$$\alpha p = x\zeta^{-g} + y\zeta^{1-g} - x\zeta^{g} - y\zeta^{g-1}$$

for some $\alpha \in \mathbb{Z}[\zeta]$. Note that by the previous paragraph no exponent
$-g$, $1 - g$, $g$, $g - 1$ is divisible by $p$. We have


$$\alpha = \frac{x}{p}\zeta^{-g} + \frac{y}{p}\zeta^{1-g} - \frac{x}{p}\zeta^{g} - \frac{y}{p}\zeta^{g-1}. \tag{11.8}$$


Now $\alpha \in \mathbb{Z}[\zeta]$ and $\{1, \zeta, \ldots, \zeta^{p-2}\}$ is a $\mathbb{Z}$-basis. Hence if all four exponents
are incongruent modulo $p$ we have $x/p \in \mathbb{Z}$, contrary to hypothesis. So
some pair of exponents must be congruent modulo $p$. Since $g \not\equiv 0, 1 \pmod{p}$
the only possibility left is that $2g \equiv 1 \pmod{p}$.
But now (11.8) can be rewritten as


$$\alpha p\zeta^g = x + y\zeta - x\zeta^{2g} - y\zeta^{2g-1} = (x - y)\lambda.$$


Taking norms we get $p \mid (x - y)$, so

$$x \equiv y \pmod{p}.$$

By the symmetry of (11.5) we also have

$$y \equiv z \pmod{p}$$

and hence

$$0 \equiv x^p + y^p + z^p \equiv 3x^p \pmod{p}.$$


Since $p \nmid x$ we must have $p = 3$.

It remains to deal with the possibility $p = 3$. Note that modulo 9, cubes
of numbers prime to $p$ (namely 1, 2, 4, 5, 7, 8) are congruent either to 1
or to $-1$. Hence modulo 9 a solution of (11.5) in integers prime to 3 takes
the form


$$\pm 1 \pm 1 \pm 1 \equiv 0 \pmod{9}$$


which is impossible. Hence finally $p \neq 3$ and we have a contradiction. $\Box$
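
The final step for p = 3 is a finite check, so it can be confirmed directly. The following sketch is our own illustration (the variable names are arbitrary): it computes the cubes of residues prime to 3 modulo 9 and confirms that no sum of three of them vanishes modulo 9.

```python
from itertools import product

# Cubes modulo 9 of the residues prime to 3.
cube_residues = sorted({(n ** 3) % 9 for n in range(1, 9) if n % 3 != 0})

# Every possible value of x^3 + y^3 + z^3 modulo 9 with x, y, z prime to 3.
possible_sums = {(a + b + c) % 9 for a, b, c in product(cube_residues, repeat=3)}

# The case p = 3 is impossible because 0 never occurs among the sums.
zero_is_attainable = 0 in possible_sums
```

The only cube residues that occur are 1 and 8, the latter being −1 modulo 9.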


A complete solution of Fermat’s Last Theorem (for regular primes!) is
thus reduced to the case where one of $x$, $y$, or $z$ is a multiple of $p$. Kummer’s
proof of this case also depends heavily on ideal theory and, although long,
would be accessible to us at this stage, except for one fact. We need to
know that (still for $p$ regular) if a unit in $\mathbb{Q}(\zeta)$ is congruent modulo $p$ to a
rational integer, then it is a $p$th power of another unit in $\mathbb{Q}(\zeta)$. The proof
of this requires new methods. It seems best to refer the reader to Borevič
and Šafarevič [7] pp. 378-81 for the missing details.


11.5 Regular Primes 


Theorem 11.8 is, of course, useless without a test for regularity. There is, 
in fact, quite a simple test, but once more the proofs are far beyond our 
present methods. We shall nonetheless sketch what is involved, and again
refer the reader to Borevič and Šafarevič [7] for the details.

Everything rests on a remarkable gadget known as the analytic class-
number formula. Let $K$ be a number field, and define the Dedekind zeta-
function


$$\zeta_K(x) = \sum N(\mathfrak{a})^{-x}$$


where $\mathfrak{a}$ runs through all ideals of the ring of integers D of $K$, and for the
moment $1 < x < \infty$. One then proves the formula

$$\lim_{x \to 1} (x - 1)\zeta_K(x) = \frac{2^{s+t}\pi^t R}{m\sqrt{|\Delta|}}\, h$$

in which $s$ and $t$ are the number of real, or complex conjugate pairs of,
monomorphisms $K \to \mathbb{C}$; $m$ is the number of roots of unity in $K$; $\Delta$ is the
discriminant of $K$; $R$ is a new constant called the regulator of $K$; and $h$ is
the class-number.

The point is that nearly everything on the right, except h, is quite easy 
to compute (R is not: it is much harder than the rest): if we could evaluate 
the limit on the left we could then work out $h$. To evaluate this limit we first
extend the definition of $\zeta_K(x)$ to allow complex values of $x$, and then use
powerful techniques from complex function theory. These involve another 
gadget known as a Dirichlet L-series. 

In the case $K = \mathbb{Q}(\zeta)$ for $\zeta = e^{2\pi i/p}$, $p$ prime, the analysis leads to an
expression for $h$ in the form of a product


$$h = h_1h_2.$$


In this, $h_2$ is the class-number of the related number field $\mathbb{Q}(\zeta + \zeta^{-1})$, and
$h_1$ is a computable integer. This would not be very helpful, except that it
can be proved that if $h_1$ is prime to $p$, then so is $h_2$. (Therefore $h$ is prime
to $p$, or equivalently $p$ is regular, if and only if $h_1$ is prime to $p$.)

Analysis of $h_1$ leads to a criterion: $h_1$ is divisible by $p$ if and only if one
of the numbers


$$S_k = \sum_{n=1}^{p-1} n^k \qquad (k = 2, 4, \ldots, p-3)$$

is divisible by $p^2$.


The numbers $S_k$ have long been associated with the Bernoulli numbers
$B_m$ defined by the series expansion

$$\frac{t}{e^t - 1} = \sum_{m=0}^{\infty} B_m\frac{t^m}{m!}.$$

Their values behave very irregularly: for $m$ odd $\neq 1$ they are zero, for
$m = 1$ we have $B_1 = -\frac{1}{2}$, and for even $m$ the first few are:

$$B_2 = \frac{1}{6}, \quad B_4 = -\frac{1}{30}, \quad B_6 = \frac{1}{42},$$

$$B_8 = -\frac{1}{30}, \quad B_{10} = \frac{5}{66}, \quad B_{12} = -\frac{691}{2730},$$

$$B_{14} = \frac{7}{6}, \quad B_{16} = -\frac{3617}{510}.$$

The connection between the $S_k$ and the $B_k$ may be shown to give:


Criterion 11.9. A prime $p$ is regular if and only if it does not divide the
numerators of the Bernoulli numbers $B_2, B_4, \ldots, B_{p-3}$. $\Box$
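
Criterion 11.9 lends itself to direct computation with exact rational arithmetic. The sketch below is an illustration under stated assumptions rather than an efficient method: it uses the convention t/(e^t − 1) = Σ B_m t^m/m! (so that B_1 = −1/2), the standard recurrence for the Bernoulli numbers, and function names of our own choosing.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max: int):
    """Return [B_0, ..., B_n_max] using sum_{k<=m} C(m+1, k) B_k = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        acc = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append(-acc / (m + 1))
    return B

def is_prime(n: int) -> bool:
    """Trial division, adequate for the small primes considered here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def irregular_primes(limit: int):
    """List odd primes p < limit dividing a numerator of B_2, B_4, ..., B_{p-3}."""
    B = bernoulli_numbers(limit)
    result = []
    for p in range(5, limit):  # p = 3 has no indices to test, so it is regular
        if is_prime(p) and any(B[k].numerator % p == 0 for k in range(2, p - 2, 2)):
            result.append(p)
    return result

small_irregular = irregular_primes(100)
```

Since `Fraction` keeps values in lowest terms, the `numerator` attribute is the numerator in the sense of the criterion.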


The first 10 irregular primes, found from this criterion, are 37, 59, 67,
101, 103, 131, 149, 157, 233, 257. As a check, it is possible to compute the
number $h_1$, with the following results:


p     h_1        |   p     h_1
3     1          |   43    211
5     1          |   47    5 · 139
7     1          |   53    4889
11    1          |   59    3 · 59 · 233 *
13    1          |   61    41 · 1861
17    1          |   67    67 · 12739 *
19    1          |   71    7^2 · 79241
23    3          |   73    89 · 134353
29    2^3        |   79    5 · 53 · 377911
31    3^2        |   83    3 · 279405653
37    37 *       |   89    113 · 118401449
41    11^2       |   97    577 · 3457 · 206209


Observe that $h_1$ is divisible by $p$ exactly in the cases $p = 37, 59, 67$ (marked
with an asterisk) as expected.


11.6 Exercises 
1. If $x$, $y$, $z$ are integers such that $x^2 + y^2 = z^2$, prove that at least one
of $x$, $y$, $z$ is a multiple of 3, at least one is a multiple of 4, and at
least one is a multiple of 5.
2. Show that the smallest value of $z$ for which there exist four distinct
solutions to $x^2 + y^2 = z^2$ with $x$, $y$, $z$ pairwise coprime (not counting
sign changes or interchanges of $x$, $y$ as distinct) is 1105, and find the
four solutions.


3. Show that there exist no solutions in non-zero integers to the equation


$$x^3 + y^3 = 3z^3.$$


4. Show that the general solution in rational numbers of the equation


$$x^3 + y^3 = u^3 + v^3$$

is

$$x = k(1 - (a - 3b)(a^2 + 3b^2)),$$
$$y = k((a + 3b)(a^2 + 3b^2) - 1),$$
$$u = k((a + 3b) - (a^2 + 3b^2)^2),$$
$$v = k((a^2 + 3b^2)^2 - (a - 3b)),$$

where $a$, $b$, $k$ are rational and $k \neq 0$; or $x = y = 0$, $u = -v$; or
$x = u$, $y = v$, or $x = v$, $y = u$. (Hint: write $x = X - Y$, $y = X + Y$,
$u = U - V$, $v = U + V$, and factorize the resulting equation in
$\mathbb{Q}(\sqrt{-3})$.)


5. For $p$ an odd prime, show that if $\zeta = e^{2\pi i/p}$, then


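$$\left(\frac{1 - \zeta^s}{1 - \zeta}\right)\left(\frac{1 - \zeta^{-s}}{1 - \zeta^{-1}}\right)$$


is a real unit in $\mathbb{Q}(\zeta)$ for $s = 1, 2, \ldots, p-1$.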


6. Let $p$ be an odd prime, $\zeta = e^{2\pi i/p}$. Kummer’s lemma says that
the units of $\mathbb{Z}[\zeta]$, thought of in the complex plane $\mathbb{C}$, lie on equally
spaced radial lines through the origin, passing through the vertices
of a regular $p$-gon (namely the powers $\zeta^s$). Now $1 + \zeta$ is a unit, so
why does Figure 11.1 (page 200) not contradict Kummer’s lemma?
Figure 11.1. Why doesn’t this contradict Kummer’s lemma?


12 The Path to the Final Breakthrough


In the late 19th and early 20th centuries, the study of Fermat’s Last Theo- 
rem built mainly on Kummer’s methods, with the notion of ‘ideal numbers’ 
being supplanted by Dedekind’s theory of ‘ideals’ in a commutative ring. 
The techniques required a high degree of mathematical and computational 
facility, and were applied to more and more special cases. For instance, in 
1905 Mirimanoff extended Kummer’s results as far as n < 257. In 1908 
Dickson generalized the theories of Germain and Legendre by investigating
$x^n + y^n = z^n$ in the case where ($n$ is prime and) none of $x$, $y$, $z$ is
divisible by $n$. He proved that in this case the equation cannot have any
solution if $n$ is a prime of the form $n = kp + 1$, where $p$ is also prime
and $k = 20, 22, 26, 28, 32, 40, 56, 64$ (with a few small exceptions). Fermat’s
Last Theorem was proving to have a nasty sting in its tail. Despite the ap- 
parently simple statement of the problem, the proofs of special cases were 
becoming ever more complex, requiring the highly specialized activity of 
mathematical experts. 


12.1 The Wolfskehl Prize 


In 1908 the situation changed dramatically, and the problem was opened 
up to other mathematicians and to a wider world of amateurs. The agent 
of change was Paul Friedrich Wolfskehl, the son of a wealthy Jewish banker; 
he was born in Darmstadt in 1856. He first studied medicine, obtaining 
his doctorate in 1880. However, debilitating multiple sclerosis made it
impossible for him to practice surgery, and in 1880 he turned instead to 
mathematics. He began this activity in Bonn, but moved to Berlin the fol- 
lowing year, where he attended the lectures of the 72-year old Kummer. So 
fascinated did Wolfskehl become with the still unproved Last Theorem of 
Fermat that he left 100,000 marks in his will, to be awarded to the first per- 
son either to prove the theorem or to give a counterexample, see Barner [4]. 
In today’s currency, this would be worth around 1.7 million dollars. The 
prize was announced by the Royal Society of Science in Göttingen, on 13
September 1908, exactly two years after Wolfskehl’s death: it was to be 
claimed on or before 13 September 2007. Although hyperinflation in Ger- 
many in the 1920s greatly diminished the value of the bequest, it was still 
valued at 75,000 marks in modern currency at the start of the 21st century, 
thanks to judicious investment. 

Wolfskehl’s act of altruism proved a mixed blessing to the mathemati- 
cal community. In the first year alone, 621 solutions were submitted, and 
although the frequency slowly decreased, attempted solutions continued to 
flow in for the next ninety years. The total number sent to the Göttingen
Academy has been estimated at over 5,000, and each attempt had to be 
read and considered by one of the judges. The endless succession of ‘proofs’ 
of Fermat’s Last Theorem kept the staff and assistants involved continually 
busy. Not only did they have to deal with problems regularly: they could 
also become involved in protracted correspondence in addition to the initial 
reply. One correspondence on record extended to over sixty communica- 
tions. 

Other universities did not escape the burden. At the Royal Society 
of Science in Berlin the numerous attempted proofs were dealt with by a 
single individual, Albert Fleck, who courteously replied to each aspirant, 
highlighting the error in the manuscript and succinctly explaining the mis- 
take. 

Sometimes the solutions were put forward by eminent mathematicians. 
Ferdinand Lindemann (1852-1939), who is famous for his proof of the
transcendence of $\pi$, published a fallacious proof of Fermat’s Last Theorem in
1901. He soon withdrew it, but he continued his efforts with a 64-page 
paper in 1908. Fleck showed him his error on pages 23 and 24, render- 
ing the remainder of the enterprise worthless. Fleck was a true ‘amateur’ 
who loved his work: his ‘Fermat Clinic’ at the Berlin Academy consisted 
solely of himself, at his desk in his room in the Mathematics Department. 
For these efforts, Fleck was awarded the Leibniz silver medal of the Berlin 
Mathematical Society in 1915, and he continued in this task until his death 
in 1943. As a Jew in Nazi Germany, his final years were blighted by per- 
secution and humiliation. As the 20th century continued, the volume of 
solutions diminished, but they continued to arrive at intervals from all
corners of the world. When the Berlin wall was removed, the number of 
solutions from Eastern Europe suddenly increased, because academics from 
the former Soviet Union were once more able to communicate freely with 
the West. 

Despite the large number of attempted proofs, the actual advances in 
the early 20th century were prosaic and highly technical. In 1909, Wieferich 
focused on the case where $p$ has no factors in common with $x$, $y$, $z$, and
proved that if there is a solution then the condition


$$2^{p-1} \equiv 1 \pmod{p^2}$$


must be satisfied. Relationships like this became much more useful with 
the arrival of computers in the middle of the 20th century, because it was 
then practical to check them for large p. The American mathematician 
Harry Schultz Vandiver (1882-1973) introduced methods that made a com- 
putational approach to the full theorem possible for any specific p (not too 
large). He had little formal education, and left school early to work in his 
father’s firm. In 1904 he collaborated with the 20-year old George Birkhoff 
in a paper on the factorization of integers of the form $a^n - b^n$, becoming
yet another in the long line of amateurs who were fascinated with number 
theory. He took a university appointment in 1919, and worked extensively 
on Fermat’s Last Theorem: he was awarded the Cole Prize of the American 
Mathematical Society in 1931 for this work. His findings built on the work 
of Kummer, and were particularly amenable to computation. In 1952, at 
the age of seventy, he used a computer to prove Fermat’s Last Theorem 
for n < 2,000. The value of n continued to be raised at intervals over the 
years. In 1976, Wagstaff proved the theorem for n < 125,000, and by 1993 
subsequent computations by others had raised this to n < 4,000,000, see 
Buhler et al. [10].
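
Wieferich’s condition, that 2 raised to the power p − 1 leaves remainder 1 modulo the square of p, shows why such criteria suit machine calculation: modular exponentiation makes each test cheap. The following sketch is our own illustration; the function names and the search bound are arbitrary.

```python
def is_prime(n: int) -> bool:
    """Trial division, adequate for a small search range."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def wieferich_primes(limit: int):
    """Primes p < limit with 2^(p-1) congruent to 1 modulo p^2."""
    # Three-argument pow performs fast modular exponentiation.
    return [p for p in range(2, limit) if is_prime(p) and pow(2, p - 1, p * p) == 1]

found = wieferich_primes(4000)
```

Below 4000 only the primes 1093 and 3511 satisfy the condition, so for every other prime in that range Wieferich’s result rules out a solution with $p$ prime to $x$, $y$, $z$.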

The methods continued to involve heavy calculations, making painful 
step-by-step progress without any simple fundamental insight that ad- 
dressed the whole problem in a truly conceptual way. The proof was re- 
markably elusive. It seemed that the Wolfskehl Prize would be unclaimed 
in the few years left before time ran out in 2007. 


12.2 Other Directions 


Meanwhile, mathematics was continuing to grow in other directions, which
seemed at the time to have nothing whatsoever to do with Fermat’s Last 
Theorem. However, history is littered with cases where mathematicians at- 
tempting to solve one problem ended up by formulating and proving some- 
thing quite different. Indeed, Kummer’s original breakthrough in his proof
of many cases of Fermat’s Last Theorem occurred when he was working 
on a totally different problem in the generalized theory of quadratic reci- 
procity. In the same manner, the ingredients that were to lead Wiles to the 
final proof of Fermat’s Last Theorem arose in areas which, at first, seemed 
to have no possible link with it. 

In the last decade of the 19th century, Henri Poincaré developed the new 
theory of algebraic topology in his book Analysis Situs (1895). He invented 
ways to translate topological problems into algebraic form. He classified 
surfaces in terms of their ‘fundamental group’ which, among other things, 
gives information about the number of ‘holes’ in the surface and relates 
this number to an integer called the ‘genus’. A sphere with no holes has 
genus 0, a torus has genus 1, and other surfaces with ‘more holes’ have 
genus $g \geq 2$.

Initially, this idea seemed to have no relationship with Fermat’s Last
Theorem. However, there is a connection. An integer solution of Fermat’s
equation, say $a^n + b^n = c^n$, corresponds to the rational solution $x = a/c$,
$y = b/c$ of the polynomial equation


$$x^n + y^n - 1 = 0. \tag{12.1}$$


Therefore Fermat’s Last Theorem is equivalent to showing that this
polynomial equation has no rational solutions with $x$ and $y$ both non-zero.
The Cambridge mathematician Louis Mordell had the bright idea of looking
not only at the rational solutions of a polynomial equation $Q(x, y) = 0$ with
rational coefficients, but also at its complex solutions. Topologically, the
complex solutions of (12.1) are related to a surface whose genus happens to
be $(n - 1)(n - 2)/2$. For $n \geq 4$, the genus is therefore 2 or more. In 1922
Mordell formulated what is now called the Mordell Conjecture: a polynomial
equation $Q(x, y) = 0$ with rational coefficients and genus $g \geq 2$ has only
finitely many rational solutions. If this could be proved, then it would
immediately follow that the Fermat equation $a^n + b^n = c^n$ ($n \geq 4$) has at
most a finite number of solutions in coprime integers.
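
The genus formula quoted above is simple enough to tabulate. This small sketch (the function name `fermat_curve_genus` is our own) shows why the Mordell Conjecture applies from n = 4 onwards but not to n = 2 or n = 3.

```python
def fermat_curve_genus(n: int) -> int:
    """Genus (n - 1)(n - 2)/2 of the surface attached to x^n + y^n - 1 = 0."""
    return (n - 1) * (n - 2) // 2

# Genus for the first few exponents.
genus_table = {n: fermat_curve_genus(n) for n in range(2, 8)}
```

The exponent 2 gives genus 0 and the exponent 3 gives genus 1, whereas every exponent from 4 upwards gives genus at least 3.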

At first this seems not to carry the Fermat quest very far forward. To 
start with, it is an unproved conjecture. Even if it were proved, it would 
only show that the equation has a finite number of solutions, when what 
we actually wish to show is that there are none. Nevertheless, the Mordell 
Conjecture turned out to be an important step towards the final proof of 
Fermat’s Last Theorem. Early work of Weil [79] led to significant progress 
in special cases, and the full Mordell Conjecture was finally proved by 
Gerd Faltings in 1983, see Bloch [6]. The proof was immediately followed 
by new results. In 1985, two different papers were published to confirm 
that Fermat’s Last Theorem is true for ‘almost all’ n. Both Granville and 
Heath-Brown showed that the proportion of those n for which Fermat’s
Last Theorem is true tends to 1 as n becomes large. It is a remarkable 
result, but still not the full proof that was so eagerly being sought. 


12.3 Modular Functions and Elliptic Curves


Other ideas of Poincaré also proved to be seminal in the proof of Fermat’s 
Last Theorem, although again the link was not obvious when they were first 
introduced. As a visual thinker, Poincaré loved to study systems that have 
symmetry. An area of particular interest was that of symmetries in complex 
function theory. He studied complex functions f(z) that remain invariant 
when their domains are operated on by a complex transformation
$z \mapsto (az + b)/(cz + d)$ for integers $a$, $b$, $c$, $d$ such that $ad - bc = 1$. These
are called Möbius transformations, and they form a group under composition. Note that when
z = —d/c, the image under the transformation is infinite, so that to obtain 
a more satisfactory theory, it is best to adjoin the point at infinity to the 
complex plane to give a surface that is topologically like the surface of a 
sphere. Functions that are invariant under all Möbius transformations are
called automorphic. 

Poincaré went further, and considered those functions transforming the 
upper half plane ($z = x + iy$ where $y > 0$) that remain invariant under the
same kind of transformations. Adding one or two technical conditions, he 
developed a theory of modular functions. These will be studied in Chap- 
ter 13. For the moment, it is sufficient to know that modular functions 
have certain properties that eventually made them a pivotal idea in the 
proof of Fermat’s Last Theorem. 

The introduction of complex numbers into the study of Fermat’s Last 
Theorem—particularly the study of polynomial equations with rational co- 
efficients, as in the Mordell Conjecture—proved to play another important 
role. This focuses on elliptic curves—curves defined by the formula 


$$y^2 = Ax^3 + Bx^2 + Cx + D$$


where $A$, $B$, $C$, $D$ are all rational. The trick that opens up a route to a
proof of Fermat’s Last Theorem can be formulated as follows: imagine
this equation for complex $x$ and $y$, and attempt to parametrize it with
functions $x = f(z)$, $y = g(z)$ satisfying the equation


$$g(z)^2 = Af(z)^3 + Bf(z)^2 + Cf(z) + D.$$


However, this point of view on the ideas involved is a fairly recent one: 
the early formulations were stated in more technical ways. See Rubin and 
Silverberg [64]. 
12.4 The Taniyama–Shimura–Weil Conjecture


In 1955, a highly significant step was taken by two Japanese mathematicians,
who were planning a conference in Tokyo on algebraic number theory.
Yutaka Taniyama was interested in elliptic curves. He had a powerful
intuitive grasp of mathematics but was prone to making errors. But his 
friend Goro Shimura, who was a much more formal mathematician, re- 
alised that Taniyama had an instinctive ability to imagine new relationships 
that were not available to more careful thinkers. At their conference they 
presented a number of problems for consideration by the participants. Four 
of these proposed by Taniyama dealt with possible relationships 
between elliptic curves and modular functions. From these developed 
what became known as the Taniyama-Shimura—Weil Conjecture. This 
conjecture hypothesized that every elliptic curve can be parametrized by 
modular functions. (Its technical statement was different, and only much 
later was it reinterpreted in this way as a result of other discoveries in the 
area.) 

At the time this was a surprising idea to most workers in the field, who 
saw elliptic curves and modular functions as inhabiting quite different parts 
of mathematics, so at first the conjecture was not taken seriously. Shimura 
left Tokyo for Princeton in 1957, resolving to return in two years to continue 
work with his colleague. His plans were not realized: in November 1958 
Taniyama committed suicide. A letter left beside his body explained that 
he did not really know why he had decided on this action: simply that he 
was in a frame of mind where he had lost confidence in his future. He was 
due to be married within a month. A few weeks later his fiancée also took 
her own life. They had promised each other they would never be parted 
and she chose to follow him in death. 

Shimura reacted to this double tragedy by devoting his energies to 
understanding the relationship between elliptic curves and modular func- 
tions. Over the years he gathered so much supporting evidence that the 
Taniyama-Shimura—Weil Conjecture became more widely appreciated. It 
occupies a pivotal position between two different areas of mathematics. 
Both of these areas had been studied intensely, but had remained separate. 
If the conjecture were true, then unsolved problems in one area could be 
translated into the language and concepts of the other, and perhaps solved 
by the novel methods available there. 

In the 1960s and 1970s, hundreds of mathematical papers appeared 
which showed that if the Taniyama—Shimura—Weil Conjecture were true, 
then other—very important—results would follow. A whole mathematical 
industry was being built on a principle that still eluded proof. 


12.5 Frey’s Elliptic Equation 


In the depths of the Black Forest in Germany, near the town of Ober- 
wolfach, is a retreat for mathematical researchers, where they can gather 
in a relaxed environment to share their ideas. In the summer of 1984 a 
group of number theorists assembled to discuss their latest ideas on elliptic 
equations. In a lecture at the meeting Gerhard Frey, from Saarbrücken, 
formulated an idea that forever changed the landscape in the search for a 
proof of Fermat’s Last Theorem. 

In common with almost all of the great breakthroughs in number theory, 
Frey’s idea depended on an ingenious calculation. Frey made the 
assumption that a genuine solution to Fermat’s equation did exist, in the 
form $a^n + b^n = c^n$, where $a, b, c$ are integers and $n > 2$. The existence 
of such a solution would, of course, show that Fermat’s Last Theorem has 
a counterexample. He then wrote the following elliptic equation on the 
board: 


$$y^2 = x(x + a^n)(x - b^n) = x^3 + (a^n - b^n)x^2 - a^n b^n x$$


which later became known as the Frey curve. He explained that he was 
interested in this equation because it has very special properties. For in- 
stance, the ‘discriminant’ of such an equation is defined to be 


$$(x_1 - x_2)^2 (x_2 - x_3)^2 (x_3 - x_1)^2$$


where $x_1, x_2, x_3$ are the roots of the right hand side. In this case 
$x_1 = 0$, $x_2 = -a^n$, $x_3 = b^n$, so the discriminant is 


$$(-a^n - b^n)^2 (b^n - 0)^2 (0 - (-a^n))^2 = c^{2n} b^{2n} a^{2n}$$


(using $a^n + b^n = c^n$). Frey remarked that it is highly unusual for a 
discriminant to be a perfect power in this way, and went on to suggest that 
the equation has other equally strange properties which mean that it 
contradicts the Taniyama–Shimura–Weil Conjecture. He was unable to prove 
this in full, but he offered convincing evidence for such a connection. So, 
if the Taniyama–Shimura–Weil Conjecture is true, then there cannot be 
any solution of the Fermat equation, and so Fermat’s Last Theorem must 
be true. 
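
The discriminant calculation above can be checked numerically. The sketch below is only an illustration: since no solution exists for $n > 2$, it uses the stand-in values $n = 2$ and $(a, b, c) = (3, 4, 5)$, for which $a^n + b^n = c^n$ does hold. The function name `root_discriminant` is an assumption made for this example, not notation from the text.

```python
from itertools import combinations

def root_discriminant(roots):
    """Product of the squared differences of the roots, as defined in the text."""
    product = 1
    for r, s in combinations(roots, 2):
        product *= (r - s) ** 2
    return product

# Stand-in values: n = 2 with the Pythagorean triple (3, 4, 5).
a, b, c, n = 3, 4, 5, 2
assert a**n + b**n == c**n

roots = [0, -a**n, b**n]      # roots of x(x + a^n)(x - b^n)
disc = root_discriminant(roots)

# The discriminant equals (abc)^(2n) whenever a^n + b^n = c^n.
print(disc == (a * b * c) ** (2 * n))   # True
```

The same check works for any triple satisfying $a^n + b^n = c^n$; the point of Frey’s argument is that for $n > 2$ such a triple would produce a curve too strange to exist.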


12.6 The Amateur who Became a Model Professional 


Andrew Wiles now enters the story. His love of mathematics dated from his 
childhood in Cambridge. As he recalled in the BBC Television Programme 
Horizon on 27 September 1997: 

I was a 10-year old, and one day I happened to be looking in my local 
public library and I found a book on math and it told a bit about 
the history of this problem—that someone had resolved this problem 
300 years ago, but no one had ever seen the proof, no one knew if 
there was a proof, and people ever since have looked for the proof. 
And here was a problem that I, a 10-year-old, could understand, 
but none of the great mathematicians in the past had been able to 
resolve. And from that moment of course I just tried to solve it 
myself. It was such a challenge, such a beautiful problem. 


This problem was Fermat’s Last Theorem. It became an obsession. As 
a teenager, Wiles reasoned that Fermat would have had only limited re- 
sources, which did not include the more subtle theories that came after 
him. Wiles therefore felt that it was worthwhile to attack Fermat’s Last 
Theorem using only the knowledge that he already had from school. As his 
interest developed, though, he began to read the literature on the subject, 
and to delve more and more deeply into it. 

In 1971, he went to Merton College, Oxford, to study mathematics. 
After graduating in 1974 he moved to Clare College, Cambridge, to study 
for a doctorate. At the time he wanted to pursue his quest for a proof 
of Fermat’s Last Theorem, but his PhD supervisor John Coates advised 
him against this, because it was possible to spend many years working on 
the problem but getting nowhere. Instead, Wiles worked in his supervi- 
sor’s area of expertise which happened to be the Iwasawa theory of elliptic 
curves—a fortuitous choice, given how the story would later turn out. 

Wiles was a Junior Research Fellow at Clare College from 1977 to 1980, 
spending a period during this time at Harvard. In 1980 he was awarded his 
doctorate, and then he spent a time at Bonn before taking a post at the 
Princeton Institute for Advanced Study in 1981. He became a professor at 
Princeton University in 1982. He was awarded a Guggenheim Fellowship 
to visit the Institut des Hautes Etudes Scientifiques and the Ecole Normale 
Supérieure in Paris during 1985-86, and it was here that events occurred 
which were to change his life. 

In 1986, Ken Ribet completed a chain of arguments that began with the 
Frey curve and used ideas of Jean-Pierre Serre on modular Galois groups, 
to prove Frey’s contention that the Taniyama—Shimura—Weil Conjecture 
implies Fermat’s Last Theorem. Andrew Wiles took this as an opportunity 
to begin work in earnest. If he could prove the Taniyama—Shimura—Weil 
Conjecture, then he would finally crack the problem that had defeated the 
entire mathematical community for nearly three hundred and fifty years. 

He soon learned that so many people continued to have an interest in 
Fermat’s Last Theorem that talking about it would lead to wide-ranging 
discussions that would use up valuable time. So for the next seven years 
he worked on the problem in secret. As he worked, only his wife, and later 
his young children and his Head of Department, were aware of what he was 
doing. He spent his life on his mathematics and with his family. When he 
got stuck in his thinking, he took a walk down the road to the lake near 
the Princeton Institute. Sometimes the combination of relaxation and deep 
incubation of ideas would suddenly come together in a new revelation. He 
found it necessary always to have a pencil and paper with him, to write 
down anything that occurred before it slipped his mind. 

In 1988 he was stunned to see an announcement in the Washington Post 
and the New York Times that Fermat’s Last Theorem had been proved by 
Yoichi Miyaoka of Tokyo University. Miyaoka had used a technique parallel 
to that of Wiles, by translating the number-theoretic problem into a dif- 
ferent mathematical theory—in this case, differential geometry. Miyaoka’s 
first outline of the proof was presented at a seminar in Bonn. Two weeks 
later he released a five-page algebraic proof, and close scrutiny by other 
mathematicians began. Soon his ‘theorem’ was seen to contradict a re- 
sult that had been proved conclusively several years before in geometry. A 
fortnight later, Gerd Faltings pinpointed the fatal weakness in Miyaoka’s 
proof. Within two months the consensus was that Miyaoka had failed: 
Wiles could breathe again and continue his work. 

In the next three years he made considerable progress with various parts 
of the proof. As he explained later, 

Perhaps I can best describe my experience of doing mathematics in 
terms of a journey through a dark unexplored mansion. You enter 
the first room of the mansion, and it’s completely dark. You stumble 
around bumping into the furniture, but gradually you learn where 
each piece of furniture is. Finally, after six months or so, you find 
the light switch, you turn it on, and suddenly it’s all illuminated. 
You can see exactly where you were. Then you move into the next 
room and spend another six months in the dark. So each of these 
breakthroughs, while sometimes they’re momentary, sometimes over 
a period of a day or so, they are the culmination of—and couldn’t 
exist without—the many months of stumbling around in the dark 
that preceded them. 

He tried to use the Iwasawa theory that he had studied for his Ph.D. 
He knew that the theory as it stood would be of little help, so he tried to 
generalize it and fix it up to attack his difficulties. It didn’t work. After 
a period of getting nowhere, in 1991 he met his supervisor John Coates at 
a conference, and Coates told him of something that appeared to bridge 
the gap. A brilliant young student, Matthias Flach, had just written a 
beautiful paper analysing elliptic equations. Wiles took a look at the work 
and concluded that it was exactly what he needed. Progress thereafter was 
more rapid. 

In 1993 Wiles gave a series of three lectures at the Isaac Newton Insti- 
tute in Cambridge, England, on Monday, Tuesday, and Wednesday 21-23 
June. The title of the series was ‘Modular forms, elliptic curves and Galois 
representations’. With typical modesty, he made no advance announce- 
ments of his recent activity. Even so, many of the giants of number theory 
realised that something special was about to happen, and they attended 
the lectures—some with cameras ready to record the event for posterity. In 
the course of his lectures, Wiles proved a partial version of the Taniyama— 
Shimura—Weil Conjecture. It was sufficiently powerful to have a very spe- 
cial corollary. At 10:30 am, at the end of his third lecture, he wrote this 
corollary on the blackboard. It was the statement of Fermat’s Last Theo- 
rem. At this point he turned to the audience, and as he sat down he said 
‘I will stop here.’ 


12.7 Technical Hitch 


Wiles was not allowed to stop there. His proof now had to be subject to 
the usual reviewing process, and soon doubts began to arise. In response 
to a query from a colleague, Nick Katz, he realised that there was a hole in 
his use of the Flach technique that he had employed in the final stages of 
his proof. On 4 December 1993 he made a statement that in the reviewing 
process a number of issues had arisen, most of which had been resolved. 
However, in view of the speculation buzzing around at the time, he acknowl- 
edged that a certain problem had occurred, and he wished to withdraw his 
claim that he had a proof. Despite this, he said that he remained confident 
that he could repair the difficulty using the methods he had announced in 
his Cambridge lectures. His life was suddenly in turmoil. Instead of being 
able to work in secret, his difficulties were now public knowledge. 

In view of the many false proofs of Fermat’s Last Theorem that had 
preceded Wiles’s announcement, his mathematical colleagues now began to 
voice doubts about the validity of his proof. In March 1994 Faltings wrote 
in Scientific American, saying: 


If it were easy, he would have solved it by now. Strictly speaking, it 
was not a proof when it was announced. 


In the same magazine, André Weil was even more damning: 


I believe he has had some good ideas in trying to construct the 
proof, but the proof is not there. To some extent, proving Fermat’s 
Theorem is like climbing Everest. If a man wants to climb Everest 
and falls short of it by 100 yards, he has not climbed Everest. 

From the beginning of 1994, Wiles began to collaborate with his former 
student Richard Taylor in an attempt to fill the gaps in the proof. They 
concentrated on the step based on the method of Flach, which was now seen 
to be inadequate, but they were unable to find an alternative argument. 
In August, Wiles addressed the International Congress of Mathematicians 
and had to announce that he was no nearer to a solution. Taylor suggested 
that they revisit Flach’s method to see if another approach were possible, 
but Wiles was sure it would never work. Nevertheless, he agreed to give it 
another try to convince Taylor that it was hopeless. 


12.8 Flash of Inspiration 


They worked on alternative approaches for a couple of weeks, with no 
result. Then Wiles suddenly had a blinding inspiration as to why the 
Flach technique failed: 


In a flash I saw that the thing that stopped it working was something 
that would make another method I had tried previously work. 


His inspiration cut through the final difficulties. On 6 October he sent 
the new proof to three mathematicians primed for the job, and all three 
reviewers found the new ideas satisfactory. The new method was even 
simpler than his earlier failed attempt, and one of the three, Faltings, even 
suggested a further simplification of part of the argument. By the following 
year there was general agreement that the proof was acceptable. When 
Taylor lectured at the British Mathematical Colloquium in Edinburgh in 
April 1995, there were no longer any real doubts about the proof’s validity. 

The proof was finally published in May 1995 in two papers in Annals 
of Mathematics. The first, from page 443 to 551, was Wiles’s paper on 
‘Modular elliptic curves and Fermat’s Last Theorem’, the second, from page 
553 to 572, was the final step by Taylor and Wiles, entitled ‘Ring theoretic 
properties of Hecke algebras’. See Wiles [82], Taylor and Wiles [75], and 
the survey by Darmon et al. [18]. 

In the years that followed, Wiles was fêted around the world. In 1995 
he received the Schock Prize in Mathematics from the Royal Swedish Acad- 
emy of Sciences and the Prix Fermat from the Université Paul Sabatier. 
The American Mathematical Society awarded him the Cole Prize in Num- 
ber Theory, worth $4,000. He was presented with a $50,000 share in the 
1995/6 Wolf Prize by the Israeli President Ezer Weizman for his ‘spectac- 
ular contributions to number theory and related fields, major advances on 
fundamental conjectures, and for settling Fermat’s Last Theorem’. (The 
other recipient, Robert Langlands, was honored for his own work in num- 
ber theory, automorphic forms, and group representations.) In 1996 Wiles 
received the National Academy of Sciences Award ($5,000) followed in 1997 
by a five-year MacArthur Fellowship ($275,000). On 27 June 1997 (after 
his proof had been published for the statutory two years laid down in the 
rules) he received the 75,000 marks awarded for the Wolfskehl Prize. He 
had solved the problem with more than a decade to go before the hundred- 
year period laid down in the original bequest had been exhausted. 

There is no Nobel Prize in mathematics: the equivalent honour is the 
Fields Medal, awarded to up to four mathematicians every four years at the 
International Congress of Mathematicians. But by tradition the medal is 
limited to individuals under the age of forty, and Wiles was just over this age 
when he proved Fermat’s Last Theorem. So in August 1998 the Congress 
celebrated this event at the Fields Medal Ceremony by awarding Wiles a 
special Silver Plaque—a unique honour in the history of the organization. 
In 1999 he won the King Faisal International Prize for Science ($200,000), 
being nominated for this honour by the London Mathematical Society. 
In addition he has been awarded honorary degrees at many universities 
around the world and, in the New Year’s Honours List in 2000, he became 
Sir Andrew Wiles, Knight Commander of the Order of the British Empire. 
The ten-year-old boy had grown to achieve his lifetime ambition, and had been 
lionized around the world for his success. He had conquered a problem 
that had foiled the world of mathematicians for 358 years. 


12.9 Exercises 


These exercises really are intended to be taken seriously. Don’t just be 
amused (or not, according to taste): do them. 


1. Think about how creative mathematics needs hard work (first) and 
relaxation. Note how the great insights mentioned in this chapter 
occurred. Does this suggest any way to help yourself understand 
difficult mathematics? 


2. Find a lake or other idyllic setting; relax and think great thoughts. 


3. Prepare yourself for the rigours to come. Subtler details will be out- 
lined in the next two chapters. 


13 


Elliptic Curves 


In this chapter we introduce the important notion of an ‘elliptic curve’. 
Elliptic curves are a natural class of plane curves that generalise the straight 
lines and conic sections studied in nearly all university mathematics courses 
(and many high school courses). However, the study of elliptic curves 
involves two new ingredients. First, it is useful to consider complex curves, 
not just real ones. Second, for some purposes it is more satisfactory to work 
in complex projective space rather than the complex ‘plane’ $\mathbb{C}^2$. (Algebraic 
geometers call $\mathbb{C}$ the complex line because it is 1-dimensional over $\mathbb{C}$.) We 
shall introduce these refinements in simple stages. 
The main topics discussed in this chapter are: 


• Lines and conic sections in the plane. 


• The ‘secant process’ on a conic section and its relation to Diophantine 
equations. 


• The definition and elementary properties of elliptic curves. 


• The ‘tangent/secant’ process on an elliptic curve and the associated 
group structure. 


Our point of view will be to emphasise analogies between conic sections, 
where the key ideas take on an especially familiar form, and elliptic curves. 
This should help to explain the origin of the ideas involved in the theory 
of elliptic curves, and make them appear more natural. 


13.1 Review of Conics 


The simplest real plane curves are straight lines, which can be defined as 
the set of solutions $(x, y) \in \mathbb{R}^2$ to a linear (or degree 1) polynomial equation 


$$Ax + By + C = 0 \tag{13.1}$$


where $A, B, C \in \mathbb{R}$ are constants and $A, B$ are not both zero. 
Next in order of complexity come the conic sections or conics, defined 
by a general quadratic (or degree 2) polynomial equation 


$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 \tag{13.2}$$


where $A, B, C, D, E, F \in \mathbb{R}$ are constants and $A, B, C$ are not all zero. 

It is well known (and can be found in most elementary texts on coordinate 
geometry or linear algebra) that conic sections can be classified 
into seven different types: ellipse, hyperbola, parabola, two distinct lines, 
one ‘double’ line, a point, or empty. A good way to see this is to transform 
Equation (13.2) into a simpler form, usually known as a normal form, 
by a change of coordinates. In fact, a suitable invertible linear change of 
coordinates 


$$X = ax + by, \qquad Y = cx + dy$$


(with $ad - bc \neq 0$ for invertibility) transforms (13.2) into one or other of 
the forms 


$$\varepsilon_1 X^2 + \varepsilon_2 Y^2 + P = 0, \qquad X^2 + Y + Q = 0$$


where $P, Q \in \mathbb{R}$ and $\varepsilon_1, \varepsilon_2 = 0, 1$, or $-1$. 

The usual proof of this (see for example Loney [44] page 323, Anton [1] 
page 359, or Roe [62] page 251) begins by rotating coordinates orthogonally 
to diagonalise the quadratic form $Ax^2 + Bxy + Cy^2$, which changes (13.2) 
to the slightly simpler form 


$$\lambda_1 x'^2 + \lambda_2 y'^2 + \alpha x' + \beta y' + \gamma = 0.$$


If $\lambda_1 \neq 0$ then the term $\alpha x'$ can be eliminated by ‘completing the square’, 
and similarly if $\lambda_2 \neq 0$ then the term $\beta y'$ can be eliminated. The coefficients 
of $x'^2$ and $y'^2$ can be scaled to $0$, $1$, or $-1$ by rescaling $x'$ and $y'$ by nonzero 
constants; furthermore, $x'$ and $y'$ can be interchanged if necessary. Finally, 
the entire equation can be multiplied throughout by $-1$. The result is the 
following catalogue of normal forms: 


Theorem 13.1. By an invertible linear coordinate change, every conic can 
be put in one of the following normal forms: 


1. $X^2 + Y^2 + P = 0$ 
2. $X^2 - Y^2 + P = 0$ 
3. $X^2 + Y + Q = 0$ 
4. $X^2 + Q = 0$ □ 


In case (1) we get an ellipse (indeed a circle) if $P < 0$, a point if $P = 0$, 
and the empty set if $P > 0$. In case (2) we get a (rectangular) hyperbola if 
$P \neq 0$ and two distinct intersecting lines if $P = 0$. Case (3) is a parabola. 
Case (4) is a pair of parallel lines if $Q < 0$, a ‘double’ line if $Q = 0$, and 
empty if $Q > 0$. 
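
The case analysis above is mechanical enough to write down as a small lookup. The following Python sketch is one possible rendering; the function name `classify_normal_form` and the string labels are assumptions made for this illustration.

```python
def classify_normal_form(case, constant):
    """Name the real conic given by normal form `case` (1 to 4) of Theorem 13.1.

    `constant` is P for cases 1 and 2, and Q for cases 3 and 4.
    """
    if case == 1:                      # X^2 + Y^2 + P = 0
        if constant < 0:
            return "ellipse (circle)"
        return "point" if constant == 0 else "empty"
    if case == 2:                      # X^2 - Y^2 + P = 0
        return "two intersecting lines" if constant == 0 else "hyperbola"
    if case == 3:                      # X^2 + Y + Q = 0
        return "parabola"
    if case == 4:                      # X^2 + Q = 0
        if constant < 0:
            return "two parallel lines"
        return "double line" if constant == 0 else "empty"
    raise ValueError("case must be 1, 2, 3 or 4")

print(classify_normal_form(1, -1))   # ellipse (circle)
print(classify_normal_form(4, 0))    # double line
```

Counting the distinct labels returned recovers the seven types listed earlier: ellipse, hyperbola, parabola, two distinct lines (intersecting or parallel), double line, point, and empty.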

Transforming back into the original (x,y) coordinates, circles trans- 
form into ellipses, rectangular hyperbolas transform into general hyper- 
bolas, parabolas transform into parabolas, lines transform into lines, and 
points transform into points. 

Even the conics, then, exhibit a rich set of possibilities when viewed 
as curves in the real plane $\mathbb{R}^2$. The situation simplifies somewhat if we 
consider the same equations, but in complex variables; it simplifies even 
more if we work in projective space. In complex coordinates, the map 
$Y \mapsto iY$ sends $Y^2$ to $-Y^2$, a scaling that cannot be performed over the 
reals. This coordinate transformation sends normal form (1) to normal 
form (2) and thereby abolishes the distinction between hyperbolas and 
ellipses. 


13.2 Projective Space 


We now show that in projective space, all the different types of conic section 
other than the double line and the point can be transformed into each 
other. (This is the case even in real projective space.) First, we recall the 
basic notions of projective geometry. For further details, see Coxeter [16], 
Loney [44], or Roe [62]. 


Definition 13.2. The real projective plane $\mathbb{RP}^2$ is the set of lines $L$ through 
the origin in $\mathbb{R}^3$. Each such line is referred to as a projective point. Each 
plane through the origin in $\mathbb{R}^3$ is called a projective line. A projective point 
is contained in a projective line if and only if the corresponding line through 
the origin is contained in the corresponding plane through the origin. 
A projective transformation or projection is a map from $\mathbb{RP}^2$ to itself of 
the form $L \mapsto \phi(L)$, where $\phi$ is an invertible linear transformation of $\mathbb{R}^3$. 
Two configurations of projective lines and projective points are projectively 
equivalent if one can be mapped to the other by a projection. 


This definition may seem strange when first encountered, but it repre- 
sents the distillation of a considerable effort on the part of geometers to 
‘complete’ the ordinary (or affine) plane $\mathbb{R}^2$ by adding ‘points at infinity’ 
at which parallel lines can be deemed to meet. We explain this idea in a 
moment, but first we record: 


Proposition 13.3. In the projective plane, any two projective lines meet in 
a unique projective point, and any two projective points can be joined by a 
unique projective line. 


Proof: These properties follow from the analogous properties of lines and 
planes through the origin in $\mathbb{R}^3$. □ 


Now we describe the interpretation of the projective plane in terms of 
points at infinity. One way to see how this comes about is to consider 
the plane $P = \{(x, y, z) : z = 1\} \subset \mathbb{R}^3$. Each point $(x, y, 1) \in P$ can be 
identified with a point $(x, y)$ in the affine plane $\mathbb{R}^2$. We write 
$(x, y) = (x, y, 1)$. Alternatively, the point $(x, y, 1)$ can be identified with the line 
through the origin in $\mathbb{R}^3$ that passes through it. Nearly every line through 
the origin in $\mathbb{R}^3$ is of this form: the exceptions are precisely the lines that 
lie in the plane $Q = \{(x, y, z) : z = 0\}$, that is, the lines parallel to $P$. 

In the same way, any straight line $M$ in $P$ can be identified with either 
a straight line in $\mathbb{R}^2$, or with a plane through the origin in $\mathbb{R}^3$, namely 
the unique plane that contains both the origin of $\mathbb{R}^3$ and $M$. Precisely one 
plane through the origin of $\mathbb{R}^3$ is not of this form, namely, the plane $Q$ 
that is parallel to $P$. (See Figure 13.1.) 

These identifications therefore embed the ‘affine’ plane $\mathbb{R}^2$ in the projective 
plane $\mathbb{RP}^2$, in such a way that points embed as projective points 
and lines embed as projective lines. However, $\mathbb{RP}^2$ contains exactly one 
extra projective line, called the line at infinity, namely the projective line 
that corresponds to the plane $Q$ through the origin of $\mathbb{R}^3$. Moreover, $\mathbb{RP}^2$ 
contains extra projective points that do not correspond to points in $\mathbb{R}^2$; 
indeed, these are precisely the projective points that lie on the line at infinity, 
since they correspond to lines through the origin in $\mathbb{R}^3$ that lie in 
the plane $Q$. These are called ‘points at infinity’. 

Figure 13.1. Construction of the projective plane. 


Each point at infinity corresponds to a unique ‘direction’ in the plane 
$\mathbb{R}^2$, that is, a set of parallel lines. This follows since any such set is parallel 
to precisely one line in the plane Q. Note that in these terms a ‘direction’ 
and its exact opposite, a 180° rotation, are deemed to be identical. (See 
Figure 13.2.) 


The key feature of this set-up is: 


Lemma 13.4. Any two parallel lines in $\mathbb{R}^2$ meet in $\mathbb{RP}^2$ at exactly one 
point at infinity. 


Figure 13.2. Points at infinity correspond to directions in the affine plane. 

Proof: Suppose $J, K$ are parallel lines in $\mathbb{R}^2$. They correspond to projective 
lines, namely, the planes $J', K'$ through the origin that contain $J, K$ 
respectively. But these meet in a unique projective point, namely the line 
in $Q$ that is parallel to $J$ and $K$. And this is a point at infinity in $\mathbb{RP}^2$. 
□ 
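
The proof can be made concrete with a little linear algebra. In the sketch below, which is an illustration rather than part of the formal development, an affine line $ax + by + c = 0$ is stored as the normal vector $(a, b, c)$ of the corresponding plane through the origin in $\mathbb{R}^3$, and a projective point is represented by any nonzero vector spanning its line through the origin. Two such planes meet in the line spanned by the cross product of their normals. The function names `cross` and `meet` are assumptions made for this example.

```python
def cross(u, v):
    """Cross product of two integer 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def meet(line1, line2):
    """A vector spanning the projective point where two projective lines meet.

    Each argument (a, b, c) stands for the plane a*X + b*Y + c*Z = 0
    through the origin, that is, for the affine line a*x + b*y + c = 0.
    """
    point = cross(line1, line2)
    if point == (0, 0, 0):
        raise ValueError("the two lines coincide")
    return point

# Parallel lines x - y = 0 and x - y + 1 = 0 meet where the third coordinate is 0,
# i.e. in the plane Q, at the point at infinity in the direction (1, 1).
print(meet((1, -1, 0), (1, -1, 1)))    # (-1, -1, 0)

# Non-parallel lines x - y = 0 and x + y - 2 = 0 meet at the affine point (1, 1).
print(meet((1, -1, 0), (1, 1, -2)))    # (2, 2, 2)
```

In the second example the vector $(2, 2, 2)$ spans the same line through the origin as $(1, 1, 1)$, which is the point $(1, 1)$ of the plane $P$.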


There is also a complex analogue: 


Definition 13.5. The complex projective plane $\mathbb{CP}^2$ is the set of lines (that 
is, 1-dimensional vector subspaces over $\mathbb{C}$) through the origin in $\mathbb{C}^3$. 
Each such line is referred to as a complex projective point. 
Each plane through the origin in $\mathbb{C}^3$ is called a complex projective line. 


For notational convenience and simplicity we will break with tradition 
and use $x, y$ to denote complex variables, when convenient. The convention 
that z = x + iy will be abandoned in this chapter and the next. 

In complex projective space it is also the case that any two projective 
lines meet in a unique projective point, and any two projective points can 
be joined by a unique projective line. 

The geometry of projective space, real or complex, is richer than mere 
lines. Any curve in $\mathbb{R}^2$ or $\mathbb{C}^2$ can be embedded in the corresponding $\mathbb{RP}^2$ 
or $\mathbb{CP}^2$. If the equation of the curve is polynomial, then this can be done 
in a systematic manner, so that ‘points at infinity’ on the curve can also 
be defined. The easiest way to achieve this is to introduce ‘homogeneous 
coordinates’. Again, the idea is straightforward. A point in $\mathbb{RP}^2$ is a line 
through the origin in $\mathbb{R}^3$. Any nonzero point $(X, Y, Z)$ on that line defines 
the line uniquely. So we can use $(X, Y, Z)$ as a system of coordinates. 
However, this system has two features that distinguish it from Cartesian 
coordinates. The first is that all values of $X, Y, Z$ are permitted except 
$(X, Y, Z) = (0, 0, 0)$. The reason is that there is no unique line joining 
$(0, 0, 0)$ to the origin in $\mathbb{R}^3$. The second is that $(aX, aY, aZ)$ represents 
the same projective point as $(X, Y, Z)$ for any nonzero constant $a$, since 
clearly both points define the same line through the origin of $\mathbb{R}^3$. We 
therefore define an equivalence relation $\sim$ by $(X, Y, Z) \sim (aX, aY, aZ)$ for 
any nonzero constant $a$. In other words, it is not the values of $(X, Y, Z)$ 
that determine the corresponding projective point, but their ratios. 

The embedding of $\mathbb{R}^2$ into $\mathbb{RP}^2$ defined above, which identifies 
$(x, y) \in \mathbb{R}^2$ with $(x, y, 1) \in P \subset \mathbb{R}^3$, also identifies the usual coordinates $(x, y)$ on 
$\mathbb{R}^2$ with the corresponding coordinates $(x, y, 1)$ on $\mathbb{RP}^2$. This represents 
the same projective point as $(ax, ay, a)$ for any $a \neq 0$. In other words, when 
$Z \neq 0$ the projective point $(X, Y, Z)$ is the same as $(X/Z, Y/Z, 1) \in P$. On 
the other hand, when $Z = 0$ the projective point $(X, Y, Z)$ lies in the plane 
$Q$ and hence represents a point at infinity. 
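
The two rules just described (triples are identified up to a nonzero scalar, and the third coordinate decides whether a point is affine or at infinity) can be sketched in Python as follows. This is a minimal illustration for integer triples; the function names `same_projective_point` and `to_affine` are assumptions made for this example.

```python
from fractions import Fraction

def same_projective_point(p, q):
    """True if the nonzero integer triples p and q are proportional."""
    if not any(p) or not any(q):
        raise ValueError("(0, 0, 0) is not a projective point")
    # p and q are proportional exactly when every 2x2 minor of the
    # matrix with rows p and q vanishes.
    return (p[0] * q[1] == p[1] * q[0] and
            p[0] * q[2] == p[2] * q[0] and
            p[1] * q[2] == p[2] * q[1])

def to_affine(p):
    """Return (X/Z, Y/Z) when Z != 0, or None for a point at infinity."""
    X, Y, Z = p
    if Z == 0:
        return None
    return (Fraction(X, Z), Fraction(Y, Z))

print(same_projective_point((1, 2, 3), (2, 4, 6)))   # True
print(to_affine((2, 4, 2)))    # (Fraction(1, 1), Fraction(2, 1))
print(to_affine((0, 1, 0)))    # None
```

Comparing cross-multiplied entries rather than dividing avoids any special treatment of zero coordinates.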

This system of coordinates $(X, Y, Z)$ on $\mathbb{RP}^2$ is known as homogeneous 
coordinates. Homogeneous coordinates are really $\sim$-equivalence classes of 
triples $(X, Y, Z)$, but it is more convenient to work with representatives and 
remember to take the equivalence relation $\sim$ into account. The choice of the 
line at infinity is conventional: in principle any line in $\mathbb{RP}^2$ can be deemed 
to be the line at infinity, and there is then a corresponding embedding of 
$\mathbb{R}^2$ in $\mathbb{RP}^2$. Indeed, any projective line in $\mathbb{RP}^2$ can be mapped to any 
other projective line by a projection, since any plane through the origin in 
$\mathbb{R}^3$ can be transformed into any other plane by an invertible linear map. 
For the purposes of this book, however, we employ the convention that 
$Z = 0$ defines the line at infinity. 

The way to transform a polynomial equation in affine coordinates $(x, y)$ 
into homogeneous coordinates is to replace $x$ by $X/Z$ and $y$ by $Y/Z$, and 
then to multiply through by the smallest power of $Z$ that makes the result 
a polynomial. For example the Cartesian equation $y - x^2 = 0$ becomes: 


$$(Y/Z) - (X/Z)^2 = 0,$$
$$YZ^{-1} - X^2 Z^{-2} = 0,$$
$$YZ - X^2 = 0.$$


Notice that as well as the usual points $(x, x^2) = (x, x^2, 1)$, this projective 
curve also contains the point at infinity given by $Z = 0$, which forces $X = 0$ 
but allows any nonzero $Y$. Since $(0, Y, 0) \sim (0, 1, 0)$ the parabola contains exactly 
one new point at infinity, in addition to the usual points in $\mathbb{R}^2$. It is easy 
to check that this point lies in the direction towards which the arms of the 
affine parabola ‘diverge’, namely the $y$-axis. (See Figure 13.3.) 
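
The homogenization recipe above can be sketched in Python. The representation is an assumption made for this illustration: a polynomial in $x, y$ is a dictionary mapping exponent pairs $(i, j)$ to the coefficient of $x^i y^j$, and the function name `homogenize` is likewise hypothetical. Multiplying by the smallest suitable power of $Z$ amounts to padding every term with $Z$ up to the total degree $d$.

```python
def homogenize(poly):
    """Homogenize a polynomial in x, y.

    `poly` maps exponent pairs (i, j) to the coefficient of x^i * y^j.
    The result maps triples (i, j, k) to the coefficient of X^i * Y^j * Z^k,
    where k = d - i - j and d is the total degree of `poly`.
    """
    terms = {e: c for e, c in poly.items() if c != 0}
    if not terms:
        raise ValueError("the zero polynomial has no degree")
    degree = max(i + j for i, j in terms)
    return {(i, j, degree - i - j): c for (i, j), c in terms.items()}

# The example from the text: y - x^2 becomes Y*Z - X^2.
parabola = {(0, 1): 1, (2, 0): -1}
print(homogenize(parabola))   # {(0, 1, 1): 1, (2, 0, 0): -1}
```

Setting $Z = 0$ in the result leaves only the terms of top degree, here $-X^2$, which is why the parabola has the single point at infinity $(0, 1, 0)$.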

Moreover, adding this point at infinity to the parabola causes it to close 
up (since the point at infinity lies on both arms). It is now plausible that the 
parabola is just an ellipse in disguise, that is, that they are projectively 
equivalent. We can verify this by means of the projection 
$\phi(X, Y, Z) = (X, Y + Z, Y - Z)$, which transforms $YZ - X^2 = 0$ into $Y^2 - Z^2 - X^2 = 0$ 
or $X^2 + Z^2 = Y^2$. Compose with $\psi(X, Y, Z) = (X, Z, Y)$ to turn this 
into $X^2 + Y^2 = Z^2$. Finally restrict back to the plane $\mathbb{R}^2$ by setting 
$(X, Y, Z) = (x, y, 1)$ and we get $x^2 + y^2 = 1$, a circle, which of course is 
just a special type of ellipse. 

Figure 13.3. Adding a point at infinity to a parabola. 

Note that if we had not interchanged $Y$ and $Z$, the result would have 
been $y^2 - x^2 = 1$, a hyperbola. So in fact the ellipse, hyperbola, and 
parabola are all projectively equivalent over $\mathbb{R}$. 
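
The substitutions used in this argument are easy to check by machine. The sketch below evaluates the polynomials at every integer point of a small grid; because each polynomial has degree at most 2 in each variable, agreement at five values per coordinate is enough to force the identities to hold everywhere. The function names (`parabola`, `circle`, `hyperbola`, `phi`, `psi`) are labels chosen for this illustration.

```python
from itertools import product

def parabola(X, Y, Z):
    return Y * Z - X**2            # Y Z - X^2

def circle(X, Y, Z):
    return X**2 + Y**2 - Z**2      # X^2 + Y^2 - Z^2

def hyperbola(X, Y, Z):
    return Y**2 - Z**2 - X**2      # Y^2 - Z^2 - X^2

def phi(X, Y, Z):
    return (X, Y + Z, Y - Z)

def psi(X, Y, Z):
    return (X, Z, Y)

def identities_hold():
    """Check both substitution identities on the grid {-2, ..., 2}^3."""
    for p in product(range(-2, 3), repeat=3):
        # Substituting phi into the parabola gives Y^2 - Z^2 - X^2.
        if parabola(*phi(*p)) != hyperbola(*p):
            return False
        # Substituting psi as well gives -(X^2 + Y^2 - Z^2).
        if parabola(*phi(*psi(*p))) != -circle(*p):
            return False
    return True

print(identities_hold())   # True
```

The overall sign in the second identity does not matter, since an equation and its negative define the same curve.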

In real projective space, then, the list of conics collapses to a smaller 
one. Namely: ellipse (= parabola = hyperbola), intersecting lines (= pair 
of parallel lines), double-line, point. 

What about complex projective space? Think about it. Hint: the real 
surprise is ‘point’. 


13.3 Rational Conics and the Pythagorean Equation 


There is an interesting link between the geometry of conics and solutions 
of quadratic Diophantine equations. 


Definition 13.6. A rational line in $\mathbb{R}^2$ or $\mathbb{C}^2$ is a line
\[ ax + by + c = 0 \tag{13.3} \]
whose coefficients $a, b, c$ are rational numbers.
A rational conic in $\mathbb{R}^2$ or $\mathbb{C}^2$ is a conic
\[ f(x,y) = ax^2 + bxy + cy^2 + dx + ey + f = 0 \tag{13.4} \]
whose coefficients $a, b, c, d, e, f$ are rational numbers.
A rational point in $\mathbb{R}^2$ or $\mathbb{C}^2$ is a point whose coordinates are rational
numbers.


There are similar definitions for real and complex projective planes. 

An intersection point of two rational lines is obviously a rational point.
However, an intersection point of a rational line and a rational conic need
not be rational. For example, consider the intersection of $x - y = 0$ with
$x^2 + y^2 - 4 = 0$, which consists of the two points $(\sqrt{2}, \sqrt{2})$ and $(-\sqrt{2}, -\sqrt{2})$.

Not all rational conics possess rational points. For an example see
Exercise 1 at the end of this chapter. A necessary and sufficient condition
for a rational conic to possess at least one rational point was proved by
Legendre, and can be found in Goldman [31] page 318. However, many
rational conics do possess rational points, and from now on we work with
such a conic.


Proposition 13.7. Let $p$ be a rational point on a rational conic $C$. Then
any rational line through $p$ intersects $C$ in rational points.


Proof: We discuss real conics: the complex case is similar. Let $f(x,y) = 0$ be
a rational conic as in (13.4) and let $Ax + By + C = 0$ be a rational line through $p$.
Suppose that $B \neq 0$; if not, then $A \neq 0$ and a similar argument applies.
Their intersection is the set of all $(x,y)$ for which $y = -(Ax + C)/B$ and
$x$ satisfies the quadratic equation $f(x, -(Ax + C)/B) = 0$. Suppose that
$f(x, -(Ax + C)/B) = Kx^2 + Lx + M$: then $K, L, M$ are rational. This
equation has at least one real root, given by $p$, so it has two real roots
(which are identical if and only if the line is tangent to the conic at $p$).
The sum of those roots is $-L/K \in \mathbb{Q}$, and the root given by $p$ is rational;
therefore the second root is also rational. □
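The argument in the proof can be tried out with exact rational arithmetic. The sketch below is a variant rather than a transcription of the proof: instead of eliminating $y$, it parametrizes the line as $p + s\,v$ for a rational direction $v$, so that vertical lines need no special case. The names `conic_value` and `second_point` are assumptions introduced for illustration.

```python
from fractions import Fraction as F

def conic_value(coeffs, x, y):
    """Evaluate a x^2 + b x y + c y^2 + d x + e y + f at (x, y)."""
    a, b, c, d, e, f = coeffs
    return a*x*x + b*x*y + c*y*y + d*x + e*y + f

def second_point(coeffs, p, v):
    """Second intersection of the conic with the line {p + s v}.

    p must be a rational point on the conic and v a rational direction.
    Substituting p + s v into the conic gives s * (quad * s + lin) = 0,
    so the root other than s = 0 is s = -lin / quad, which is rational.
    Returns None if the line meets the conic only at p in the affine plane.
    """
    a, b, c, d, e, f = [F(k) for k in coeffs]
    x0, y0 = F(p[0]), F(p[1])
    dx, dy = F(v[0]), F(v[1])
    assert conic_value((a, b, c, d, e, f), x0, y0) == 0, "p is not on the conic"
    quad = a*dx*dx + b*dx*dy + c*dy*dy
    lin = (2*a*x0 + b*y0 + d)*dx + (b*x0 + 2*c*y0 + e)*dy
    if quad == 0:
        return None
    s = -lin / quad
    return (x0 + s*dx, y0 + s*dy)

# Unit circle, p = (-1, 0), line of slope 1/2 through p.
circle = (1, 0, 1, 0, 0, -1)
print(second_point(circle, (-1, 0), (1, F(1, 2))))   # the point (3/5, 4/5)
```

When the direction is tangent at $p$ the linear term vanishes and the function returns $p$ itself, matching the remark about identical roots.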


This result immediately leads to a method for parametrizing all rational
points on a rational conic (provided it possesses at least one rational point).
Let $C$ be a rational conic with a rational point $p$ and let $L$ be any rational
line. For any point $q \in L$ the line joining $q$ to $p$ meets $C$ at $p$, and at some
other point (which is distinct from $p$ unless the line concerned is tangent
to $C$). Define a map $\pi : L \to C$ by letting $\pi(q)$ be this second point of
intersection of the line joining $q$ to $p$. (See Figure 13.4.)


Figure 13.4. Parametrization of the rational points of a rational conic in terms of
the rational points on a rational line.


Theorem 13.8. With the above notation, $\pi(q)$ is rational if and only if $q$
is rational.


Proof: Clearly $\pi(q)$ is rational if and only if the line joining $p$ to $q$ is
rational. But this is the case if and only if $q$ is a rational point. □


Example 13.9. Suppose that $C$ is the unit circle $x^2 + y^2 - 1 = 0$, which is
a rational conic. It contains the rational point $p = (-1, 0)$. Let $L$ be the
rational line $x = 0$. The rational points on $L$ are the points $(0, t)$ where
$t \in \mathbb{Q}$.

The line joining $p$ to $(0, t)$ has equation
\[ y = t(x + 1) \]
and this meets the circle at
\[ (x, y) = \left( \frac{1 - t^2}{1 + t^2}, \frac{2t}{1 + t^2} \right). \]
Thus we have the identity
\[ \left( \frac{1 - t^2}{1 + t^2} \right)^2 + \left( \frac{2t}{1 + t^2} \right)^2 = 1 \]
or equivalently
\[ (1 - t^2)^2 + (2t)^2 = (1 + t^2)^2 \]
providing solutions of the Diophantine equation $x^2 + y^2 = z^2$ in rational
numbers. Indeed by Theorem 13.8 every rational solution is of this form.

This is very close to the parametrization of Pythagorean triples obtained
in Lemma 11.1. Indeed, if we set $t = r/s$ we can easily obtain the result of
that lemma.
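The passage from the rational parameter $t = r/s$ to an integer triple can be sketched as follows. The function names are assumptions, and the reduction by the greatest common divisor is added here only so that the output is primitive; it is not a step taken in the example itself.

```python
from fractions import Fraction
from math import gcd

def circle_point(t):
    """The rational point ((1 - t^2)/(1 + t^2), 2t/(1 + t^2)) on the unit circle."""
    t = Fraction(t)
    return ((1 - t*t) / (1 + t*t), 2*t / (1 + t*t))

def triple(r, s):
    """Integer solution of x^2 + y^2 = z^2 obtained from t = r/s.

    Clearing denominators in the identity of Example 13.9 gives
    (s^2 - r^2)^2 + (2 r s)^2 = (s^2 + r^2)^2; common factors are removed.
    """
    a, b, c = s*s - r*r, 2*r*s, s*s + r*r
    g = gcd(gcd(a, b), c)
    return (a // g, b // g, c // g)

print(circle_point(Fraction(1, 2)))   # the point (3/5, 4/5)
print(triple(1, 2))                   # (3, 4, 5)
print(triple(2, 3))                   # (5, 12, 13)
```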


13.4 Elliptic Curves

Elliptic curves arise from the study of plane cubic curves
\[ \sum_{i+j \le 3} A_{ij} x^i y^j = 0 \tag{13.5} \]
where the $A_{ij}$ are constants and the cubic coefficients $A_{30}, A_{21}, A_{12}, A_{03}$
are not all zero.
Over the reals, such curves were classified by Newton in (probably)
1668: he distinguished 58 different kinds. See Westfall [81] page 200. As in
the case of conics, the key to such a classification is to transform coordinates
so that (13.5) takes some simpler form. There are several ways to do this.
In order to state the first, we need two definitions:


Figure 13.5. Typical singular points: (left) self-intersection, (right) cusp.


Definition 13.10. Let $C$ be a curve in the plane (real or complex, affine
or projective). A point $x \in C$ is regular if there is a unique tangent to $C$
at $x$. Otherwise, $x$ is singular.

The curve $C$ is non-singular if it has no singular points; that is, every
point $x \in C$ is regular.


Typical singular points are self-intersections and cusps as in Figure 13.5. 
Although it looks as if the tangent at a cusp point is unique, actually there 
are three ‘coincident’ tangents—that is, a tangent of multiplicity 3. 


Definition 13.11. Two curves $C, D \subseteq \mathbb{CP}^2$ (or $\mathbb{RP}^2$) are projectively
equivalent if there is a projection $\phi$ such that $\phi(C) = D$.


Theorem 13.12. (Weierstrass Normal Form.) Every nonsingular cubic
curve in $\mathbb{CP}^2$ is projectively equivalent to a curve which in affine
coordinates takes the Weierstrass normal form
\[ y^2 = 4x^3 - g_2 x - g_3 \tag{13.6} \]
where $g_2, g_3 \in \mathbb{C}$ are constants.


Proof: We sketch the proof. The first step is to establish that every
nonsingular cubic curve $C$ has at least one inflexion point. This is a point
at which the tangent line has triple contact with the curve, in the following
sense. Suppose that the equation of the curve is $f(x,y) = 0$, and let
$(\xi, \eta) \in C$. A general line through $(\xi, \eta)$ has equation $a(x - \xi) + b(y - \eta) = 0$
for $a, b \in \mathbb{C}$. This line meets $C$ at $(\xi, \eta)$, and in general (since the equation
is cubic) it meets it at two other points. However, the cubic equation that
determines these intersection points may have multiple zeros. The line is
a tangent at $(\xi, \eta)$ if that point corresponds to a double zero. The point
$(\xi, \eta)$ is an inflexion if it corresponds to a triple zero.

By writing down the equations for an inflexion point, it can be shown
that any nonsingular cubic curve in $\mathbb{CP}^2$ has exactly nine inflexion points,
if multiplicities are taken into account. In particular, it has at least one.
See Brieskorn and Knörrer [8] page 291 for details.

By a projection, we may assume that $(0, 1, 0)$ is an inflexion point and
that the tangent there has equation $Z = 0$. This implies that the cubic
curve has a homogeneous equation of the form
\[ Y^2 Z + AXYZ + BYZ^2 + CX^3 + DX^2 Z + EXZ^2 + FZ^3 = 0 \]
which in affine coordinates becomes
\[ y^2 + (Ax + B)y + g(x) = 0 \tag{13.7} \]
where $g(x)$ is a cubic polynomial. Define new affine coordinates $(x', y')$ by
completing the square:
\[ x' = x, \qquad y' = y + \tfrac{1}{2}(Ax + B). \]
Then (13.7) transforms into
\[ y'^2 + h(x') = 0 \]
where $h$ is a cubic polynomial. There exists a linear change of coordinates
$x'' = px' + q$ that puts $-h(x')$ into the form $4x''^3 - g_2 x'' - g_3$ while leaving
$y'$ unchanged. □


The coefficient 4 on the $x^3$ term in Weierstrass normal form is
traditional: it could be made equal to 1, but we will shortly see that the 4 is
more convenient in some circumstances. The notation $g_2, g_3$ for the linear
and constant coefficients is also traditional.

We may now define an elliptic curve. 


Definition 13.13. An elliptic curve is the set of points $(x,y) \in \mathbb{C}^2$ that
satisfy the equation
\[ y^2 = Ax^3 + Bx^2 + Cx + D \tag{13.8} \]
where $A, B, C, D \in \mathbb{C}$ are constants.


Strictly speaking, this defines a complex affine elliptic curve. There is a
projective analogue $Y^2 Z = AX^3 + BX^2 Z + CXZ^2 + DZ^3$, where $(X, Y, Z)$
are homogeneous coordinates on $\mathbb{CP}^2$. When $A, B, C, D \in \mathbb{R}$ we can
restrict attention to real variables, getting a real elliptic curve. Moreover,
in the real case we can draw the graph of (13.8) in the plane to illustrate
certain features of the geometry: we do this frequently below. Figure 13.6
shows the real elliptic curve $y^2 = 4x^3 - 3x + 2$ for which $g_2 = 3$, $g_3 = -2$.

Figure 13.6. The real elliptic curve $y^2 = 4x^3 - 3x + 2$.

The most important elliptic curves are those for which $A, B, C, D$ are
rational. We shall call these rational elliptic curves, and omit ‘rational’
whenever the context permits.


13.5 The Tangent/Secant Process 


In Section 13.3 we showed that the rational points on a rational conic can
be parametrized by the rational points on a rational line, once we know 
one rational point on the conic. A similar approach to rational points on a 
rational elliptic curve does not lead to such a definitive result, but in some 
respects the partial result that is thereby obtained is more interesting. 


Proposition 13.14. Over $\mathbb{CP}^2$ a rational line cuts a rational elliptic curve
in three points (counting multiplicities). If two of these points are rational,
then so is the third.


Proof: Suppose that the affine equation of the elliptic curve is $f(x,y) = 0$
and let the line have affine equation $ax + by + c = 0$. Without loss of
generality assume that $b \neq 0$, and solve for $y$, to get $y = -(ax + c)/b$.
Substitute in $f$ to get $f(x, -(ax + c)/b) = 0$. This is a cubic polynomial
$px^3 + qx^2 + rx + s$ with rational coefficients, and its roots determine the
$x$-coordinates of the intersection points. The corresponding $y$-coordinates
are equal to $-(ax + c)/b$, hence are rational if and only if $x$ is rational.
Since the sum of the roots of the cubic is equal to the rational number
$-q/p$, if two roots are rational then so is the third. □


Figure 13.7. Constructing new rational points on an elliptic curve.


Incidentally, we stated Proposition 13.14 in projective form because in 
the affine case there are occasions when the third root of the cubic is at 
infinity. That is, the cubic actually reduces to a quadratic. We slid over 
this point in the proof. The proposition implies that once we have found 
two rational points on a rational elliptic curve, we can find another by 
drawing the line through those points and seeing where else it cuts the 
curve, as in Figure 13.7 (left). In fact, we can do slightly better: find one 
rational point and see where else the tangent at that point cuts the curve, 
Figure 13.7 (right). 
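The chord and tangent constructions can be sketched with exact rational arithmetic for a curve in the form (13.8). The curve $y^2 = x^3 + 17$ and its points $(-2, 3)$ and $(-1, 4)$ are chosen here only as a convenient illustration, and the name `third_point` is an assumption; the function returns `None` when the third intersection is the point at infinity.

```python
from fractions import Fraction as F

def third_point(curve, P, Q):
    """Third intersection of the line PQ with y^2 = A x^3 + B x^2 + C x + D.

    If P == Q the tangent at P is used. Substituting y = m x + k into the
    curve gives A x^3 + (B - m^2) x^2 + ... = 0, whose roots sum to
    (m^2 - B)/A; two of the roots are the x-coordinates of P and Q.
    """
    A, B, C, _ = curve
    x1, y1, x2, y2 = F(P[0]), F(P[1]), F(Q[0]), F(Q[1])
    if x1 == x2:
        if y1 != y2 or y1 == 0:
            return None                       # vertical line: third point at infinity
        m = (3*A*x1*x1 + 2*B*x1 + C) / (2*y1)  # slope of the tangent at P
    else:
        m = (y2 - y1) / (x2 - x1)             # slope of the chord PQ
    x3 = (m*m - B) / A - x1 - x2
    return (x3, m*(x3 - x1) + y1)

curve = (1, 0, 0, 17)                         # y^2 = x^3 + 17
print(third_point(curve, (-2, 3), (-1, 4)))   # chord: the point (4, 9)
print(third_point(curve, (-2, 3), (-2, 3)))   # tangent: the point (8, 23)
```

Both outputs are again rational points of the curve, as Proposition 13.14 predicts: $4^3 + 17 = 9^2$ and $8^3 + 17 = 23^2$.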


13.6 Group Structure on an Elliptic Curve 


We now show that the rational points on a rational elliptic curve form an 
abelian group, under an operation of ‘addition’ closely related to Proposi- 
tion 13.14. This remarkable fact forms the basis of the arithmetical theory 
of elliptic curves. 

Assume that an elliptic curve C in CP? contains at least one rational 
point, which we denote by O for reasons soon to become apparent. 


Definition 13.15. Let $P$ and $Q$ be rational points on $C$. Define $P * Q$ to
be the third point in which the line through $P$ and $Q$ meets $C$.

Let $G$ be the set of all rational points in $C$. For some fixed but arbitrary
choice $O$ of a rational point on $C$, define the operation $+$ on $G$ by
\[ P + Q = (P * Q) * O \tag{13.9} \]


Figure 13.8. The group operation on the rational points of an elliptic curve.


(See Figure 13.8.) We now prove that the operation + gives G the 
structure of an abelian group. 

In order to achieve this, we require the following fundamental theorem 
in algebraic geometry. 


Theorem 13.16. (Bézout’s Theorem.) Let $P(X,Y,Z)$ be a homogeneous
polynomial of degree $p$ over $\mathbb{C}$, let $Q(X,Y,Z)$ be a homogeneous polynomial
of degree $q$ over $\mathbb{C}$, and suppose that $P, Q$ have no common factor of degree
$\geq 1$. Then the number of intersection points of the curves in $\mathbb{CP}^2$ defined
by $P = 0$, $Q = 0$ is precisely $pq$ (provided multiplicities are taken into
account).


Proof: A detailed proof, along with a careful discussion of how to count
multiplicities, can be found in Brieskorn and Knörrer [8] page 227. □

Next we state without proof a lemma from algebraic geometry:


Lemma 13.17. Let two curves of degree $n$ meet in exactly $n^2$ distinct
points, and let $0 < m < n$. If exactly $mn$ of these points lie on an irreducible
curve of degree $m$, then the remaining $n(n - m)$ lie on a curve of degree
$n - m$.


Proof: See Brieskorn and Knörrer [8] page 245. □


We may now prove:

Theorem 13.18. The set $G$ of rational points on a rational elliptic curve
forms an abelian group under the operation $+$. The identity element is $O$.


Proof: First, observe that $P * Q = Q * P$, since the process of constructing
the third point on the line through $P$ and $Q$ does not depend on the order
in which we consider $P$ and $Q$. So
\[ P + Q = (P * Q) * O = (Q * P) * O = Q + P \]
and the operation $+$ is commutative.
We claim that $P + O = P$, so $O$ is the identity element. This follows
since
\[ P + O = (P * O) * O. \]
If $Q = P * O$, then $P$, $O$, $Q$ are collinear. Assume for a moment that these
are distinct. Then $(P * O) * O = Q * O$, and this must be $P$. If they are
not distinct, then either $P = O$ or $P = Q$, and in either case it is easy to
complete the calculation.

The inverse of P is easily seen to be P * (O * O). 

The most complicated part of the proof is the associative law 


(P+Q)+R=P+(Q+R). 


Figure 13.9 indicates the associated geometry. 
First, we define
\[
\begin{aligned}
S &= P + Q, \\
T &= S + R, \\
U &= Q + R.
\end{aligned}
\]
We have to prove that $P + U = T$. Denote the auxiliary points used to
construct $S, T, U$ by $S', T', U'$, as in the figure; that is, $S' = P * Q$,
$T' = S * R$, and $U' = Q * R$. It suffices to show that
$P, U, T'$ are collinear. To do so, let $L_1$ be the line through $P, Q, S'$, let $L_2$
be the line through $S, R, T'$, let $L_3$ be the line through $O, U', U$, let $G_1$ be
the line through $O, S', S$, and let $G_2$ be the line through $Q, R, U'$.

Recall from Lemma 13.17 that if two curves $C, D$ of order $n$ meet in
exactly $n^2$ points, $nm$ of which lie on an irreducible curve $E$ of order $m$,
then the remainder lie on a curve $F$ of order $n - m$.

Apply this to the cubic curves $C, D$, where $D = L_1 \cup L_2 \cup L_3$, and take
$E = G_1$. First, suppose $C \cap D$ contains exactly 9 distinct points. Then it
follows that $Q, R, U', P, T', U$ lie on a conic. But $Q, R, U'$ lie on the line
$G_2$, so $G_2$ is a component of this conic. Let the other component, also a
line, be $G_3$. Then if $P, T', U$ do not lie on $G_2$, they must lie on $G_3$, and
the proof is complete.

If $C \cap D$ contains fewer than 9 distinct points, then some intersection
points are multiple. A suitable small perturbation of the curve $C$ splits
these apart, and a limiting argument completes the proof. For a more
algebro-geometric approach, replacing the limiting procedure by Zariski
continuity, see Brieskorn and Knörrer [8] page 310. □


Figure 13.9. Geometry for the proof of the associative law on G.
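The group axioms can be spot-checked by machine on a specific curve. The sketch below implements the operation $*$ for a curve of the form (13.8), including the point at infinity (represented by `None`), and then defines $P + Q = (P * Q) * O$ for an arbitrary choice of $O$. The curve $y^2 = x^3 + 17$, the sample points, and all function names are assumptions made for illustration; a finite check of this kind is of course no substitute for the proof.

```python
from fractions import Fraction as F

INF = None  # the point at infinity (0, 1, 0) of Y^2 Z = A X^3 + B X^2 Z + ...

def star(curve, P, Q):
    """Third intersection P * Q of the line PQ with y^2 = A x^3 + B x^2 + C x + D."""
    A, B, C, _ = curve
    if P is INF and Q is INF:
        return INF                      # the tangent Z = 0 has triple contact at infinity
    if P is INF or Q is INF:
        x, y = Q if P is INF else P
        return (F(x), -F(y))            # vertical line through (x, y)
    x1, y1, x2, y2 = F(P[0]), F(P[1]), F(Q[0]), F(Q[1])
    if x1 == x2:
        if y1 != y2 or y1 == 0:
            return INF                  # vertical chord or vertical tangent
        m = (3*A*x1*x1 + 2*B*x1 + C) / (2*y1)   # tangent slope when P == Q
    else:
        m = (y2 - y1) / (x2 - x1)       # chord slope
    x3 = (m*m - B) / A - x1 - x2        # the three roots sum to (m^2 - B)/A
    return (x3, m*(x3 - x1) + y1)

def add(curve, P, Q, O=INF):
    """P + Q = (P * Q) * O, as in (13.9)."""
    return star(curve, star(curve, P, Q), O)

curve = (1, 0, 0, 17)                   # y^2 = x^3 + 17
P, Q, R = (-2, 3), (-1, 4), (2, 5)
for O in (INF, (-2, 3), (8, 23)):
    lhs = add(curve, add(curve, P, Q, O), R, O)
    rhs = add(curve, P, add(curve, Q, R, O), O)
    inverse = star(curve, P, star(curve, O, O))
    print(lhs == rhs, add(curve, P, O, O) == P, add(curve, P, inverse, O) == O)
```

Taking $O$ to be the point at infinity gives the familiar rule that three collinear points sum to zero; the loop also tries two affine choices of $O$, for which the same identities hold.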


Remark. The operation + can be defined in exactly the same way for 
any elliptic curve, rational or not. The same proof shows that (C,+) is an 
abelian group. When C is rational, G is a subgroup. 


One of the most important theorems in this area is:


Theorem 13.19. (Mordell’s Theorem.) Suppose that $C$ is a nonsingular
rational cubic curve in $\mathbb{CP}^2$ having a rational point. Then the group $G$ of
rational points is finitely generated.


Proof: The original proof is due to Mordell [53]. A sketch, based on a
version due to Weil [79], is described in Goldman [31]. The main idea is
to define a function $H(x)$, for $x \in G$, that measures the ‘complexity’ of $x$,
and use $H$ as the basis of an inductive argument. This function has the
following properties:

• For any $K > 0$ the set $\{P \in G : H(P) < K\}$ is finite.

• For each $Q \in G$ there exists a constant $c$ depending only on $Q$ such
that $H(P + Q) \leq c\,(H(P))^2$.

• There exists a constant $d$ such that $H(P) \leq d\,(H(2P))^{1/4}$.

• The quotient group $G/2G$ is finite.

In fact, if $P = (x, y)$ and $x = m/n$ in lowest terms, we take $H(P) =
\max(|m|, |n|)$. □
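To get a feel for the function $H$, the sketch below computes it along a sequence of tangent constructions on one curve. The curve $y^2 = x^3 + 17$, the starting point $(-2, 3)$, and the helper names are illustrative assumptions. The helper `tangent_third_point` returns $P * P$, which has the same $x$-coordinate (and hence the same value of $H$) as $2P$ when $O$ is taken to be the point at infinity.

```python
from fractions import Fraction

def height(P):
    """H(P) = max(|m|, |n|) where the x-coordinate of P is m/n in lowest terms."""
    x = Fraction(P[0])                  # Fraction always stores lowest terms
    return max(abs(x.numerator), abs(x.denominator))

def tangent_third_point(P):
    """The point where the tangent at P meets y^2 = x^3 + 17 again."""
    x, y = Fraction(P[0]), Fraction(P[1])
    m = 3 * x * x / (2 * y)             # slope of the tangent at P
    x3 = m * m - 2 * x                  # the three roots of the cubic sum to m^2
    return (x3, m * (x3 - x) + y)

P = (-2, 3)
heights = []
for _ in range(3):
    heights.append(height(P))
    P = tangent_third_point(P)
print(heights)   # [2, 8, 752]
```

The rapid growth of these values under doubling is the phenomenon that the third property of $H$ quantifies.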


Recall from Proposition 1.18 that a finitely generated abelian group is
of the form
\[ F \oplus \mathbb{Z}^k \]
where $F$ is a finite abelian group, hence a direct sum of finite cyclic groups.
The group $F$, which is unique, consists of the elements of finite order, and
is called the torsion subgroup. The groups $G$ determined by elliptic curves
are very special, as is shown by the following theorem of Mazur:


Theorem 13.20. Let $G$ be the group of rational points on an elliptic curve.
Then the torsion subgroup of $G$ is isomorphic either to $\mathbb{Z}_l$ where $1 \leq l \leq 10$
or $l = 12$, or to $\mathbb{Z}_2 \oplus \mathbb{Z}_{2l}$ where $1 \leq l \leq 4$.


Proof: The proof is very technical: see Mazur [48, 49]. □


13.7 Applications to Diophantine Equations


We now describe an application of the above ideas to an equation very 
similar to Fermat’s. This application is due to Elkies [23]. 

We know that it is impossible for two cubes to sum to a cube, but might
it be possible for three cubes to sum to a cube? It is; in fact $3^3 + 4^3 + 5^3 = 6^3$.
Euler conjectured that in general $n$ $n$th powers can sum to an $n$th power,
but not $n - 1$. It has been proved that Euler’s conjecture is false. In 1966
L. J. Lander and T. R. Parkin [42] found the first counterexample to Euler’s
conjecture: four fifth powers whose sum is a fifth power. In fact
\[ 27^5 + 84^5 + 110^5 + 133^5 = 144^5. \]
As a check:
\[
\begin{aligned}
27^5 &= 14348907 \\
84^5 &= 4182119424 \\
110^5 &= 16105100000 \\
133^5 &= 41615795893 \\
144^5 &= 61917364224.
\end{aligned}
\tag{13.10}
\]

They found this example by exhaustive computer search. 
In 1988 Noam Elkies found another counterexample by applying the 
theory of elliptic curves: three fourth powers whose sum is a fourth power. 


\[
\begin{aligned}
2682440^4 &= 51774995082902409832960000 \\
15365639^4 &= 55744561387133523724209779041 \\
18796760^4 &= 124833740909952854954805760000 \\
20615673^4 &= 180630077292169281088848499041
\end{aligned}
\tag{13.11}
\]


Instead of looking for integer solutions to the equation $x^4 + y^4 + z^4 = w^4$,
Elkies divided out by $w^4$ and looked at the surface $r^4 + s^4 + t^4 = 1$ in
coordinates $(r, s, t)$. An integer solution to $x^4 + y^4 + z^4 = w^4$ leads to a
rational solution $r = x/w$, $s = y/w$, $t = z/w$ of $r^4 + s^4 + t^4 = 1$. Conversely,
given a rational solution of $r^4 + s^4 + t^4 = 1$, we can assume that $r, s, t$ all have
the same denominator $w$ by putting them over a common denominator, and
that leads directly to a solution to $x^4 + y^4 + z^4 = w^4$. Demjanenko [19] had
found a rather complicated condition for a rational point $(r, s, t)$ to lie on
the closely related surface $r^4 + s^4 + t^2 = 1$. Namely, such a rational point
exists if and only if there exist $x, y, u$ such that
\[
\begin{aligned}
r &= x + y \\
s &= x - y \\
(u^2 + 2)\,y^2 &= -(3u^2 - 8u + 6)x^2 - 2(u^2 - 2)x - 2u \\
(u^2 + 2)\,t &= 4(u^2 - 2)x^2 + 8ux + (2 - u^2)
\end{aligned}
\]


To solve Elkies’s problem it is enough to show that $t$ can be made a square.
A series of simplifications shows that this can be done provided the equation
\[ Y^2 = -31790X^4 + 36941X^3 - 56158X^2 + 28849X + 22030 \]


has a rational solution. This equation defines an elliptic curve. (Despite
the presence of a fourth power on the right hand side, it can be transformed
into a cubic. A similar transformation can be found in Section 14.2. See
also McKean and Moll [52] page 254.) Conditions are known under which
no solution can exist, but these conditions did not hold in this case, which
showed that such a solution might possibly exist. At this stage Elkies tried
a computer search, and found the solution
\[ X = -\frac{31}{467}, \qquad Y = \frac{30731278}{467^2}. \]
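That this pair really does satisfy the quartic equation can be confirmed with exact rational arithmetic. The snippet below only checks the stated solution; it does not attempt to reproduce the search that found it, and the function name `quartic` is an assumption.

```python
from fractions import Fraction as F

def quartic(X):
    """Right-hand side of the equation Y^2 = -31790 X^4 + ... + 22030."""
    return -31790*X**4 + 36941*X**3 - 56158*X**2 + 28849*X + 22030

X = F(-31, 467)
Y = F(30731278, 467**2)
print(Y*Y == quartic(X))   # True
```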


From this he deduced the rational solution
\[ r = -\frac{18796760}{20615673}, \qquad s = \frac{2682440}{20615673}, \qquad t = \frac{15365639}{20615673}. \]


This led directly to the counterexample to Euler’s conjecture for fourth
powers, namely
\[ 2682440^4 + 15365639^4 + 18796760^4 = 20615673^4. \]
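As with the fifth-power example, the identity can be confirmed directly with exact integers, together with the corresponding rational point on the surface $r^4 + s^4 + t^4 = 1$. The helper name is an assumption introduced for this check.

```python
from fractions import Fraction as F

def sums_to_fourth_power(x, y, z, w):
    """True when x^4 + y^4 + z^4 == w^4."""
    return x**4 + y**4 + z**4 == w**4

x, y, z, w = 2682440, 15365639, 18796760, 20615673
print(sums_to_fourth_power(x, y, z, w))      # True

# The same solution viewed as a rational point on r^4 + s^4 + t^4 = 1.
r, s, t = F(x, w), F(y, w), F(z, w)
print(r**4 + s**4 + t**4 == 1)               # True
```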


In fact, there are infinitely many solutions. The theory of elliptic curves
provides a general procedure for constructing new rational points from
old ones: the tangent/secant construction. Using a version of this, Elkies
proved that infinitely many rational solutions exist. In fact he proved that
rational points are dense on the surface $r^4 + s^4 + t^4 = 1$, that is, any patch
of the surface, however small, must contain a rational point. The second
solution generated by the tangent/secant construction is

x = 1439965710648954492268506771833175267850201426615300442218292336336633,
y = 4417264698994538496943597489754952845854672497179047898864124209346920,
z = 9033964577482532388059482429398457291004947925005743028147465732645880,
w = 9161781830035436847832452398267266038227002962257243662070370888722169.

After Elkies had discovered there was a solution, Roger Frye of the Thinking
Machines Corporation did an exhaustive computer search. He found a
smaller solution, indeed the smallest possible solution:
\[
\begin{aligned}
95800^4 &= 84229075969600000000 \\
217519^4 &= 2238663363846304960321 \\
414560^4 &= 29535857400192040960000 \\
422481^4 &= 31858749840007945920321.
\end{aligned}
\tag{13.12}
\]


13.8 Exercises


1. Prove that the rational conic $x^2 + y^2 - 3 = 0$ contains no rational
points.

[Hint: Rational solutions correspond to integer solutions of the
equation $X^2 + Y^2 = 3Z^2$. Without loss of generality $X$, $Y$ and $Z$ have no
common factor $> 1$. Now consider the equation mod 3.]


2. Consider a cubic curve $C$ in $\mathbb{CP}^2$ whose equation is
\[ X^3 + Y^3 + Z^3 - AXYZ = 0 \tag{13.13} \]
where $A^3 = 27$. Show that $C$ is singular.

[Hint: When $A = 3$ there is a factorization $X^3 + Y^3 + Z^3 - 3XYZ =
(X + Y + Z)(X^2 + Y^2 + Z^2 - XY - YZ - ZX)$. Draw the real affine
versions of the two curves $X + Y + Z = 0$ and $X^2 + Y^2 + Z^2 - XY -
YZ - ZX = 0$.]


Elliptic Functions


So far our discussion has been algebraic. We now introduce methods from
complex analysis, which lead up to the Taniyama-Shimura-Weil Conjecture,
which forms the centrepiece of Wiles’s approach to, and proof of,
Fermat’s Last Theorem.

The main topics discussed in this chapter are:


• Trigonometric functions as a link between conic sections, Diophantine
equations, and complex analysis.

• Weierstrassian elliptic functions and their connection with elliptic
curves.

• Elliptic modular functions and their connection with elliptic curves.

• The Frey elliptic curve, which links Fermat’s Last Theorem to elliptic
curves.

• The Taniyama-Shimura-Weil Conjecture.


An excellent reference for the material covered in this chapter, and many 
related topics, is McKean and Moll [52]. See also King [41] for the basic 
material, plus some fascinating connections with the solution of polynomial 
equations. 


14.1 Trigonometry Meets Diophantus 


In this section we explore a rich area of interconnections between
trigonometric functions, complex analysis, algebraic geometry, and the Pythagorean
equation
\[ X^2 + Y^2 = Z^2. \tag{14.1} \]


We take the point of view that we do not yet have the machinery of trigono- 
metric functions available, and show how to derive these functions from the 
aforementioned interconnections. This is a useful ‘dry run’ for subsequent 
generalisations to elliptic functions. Of course it helps to bear the standard 
trigonometric functions in mind throughout, since then the manipulations 
we perform make more sense. 


14.1.1 An Approach to Trigonometric Functions 


We can consider (14.1) as the projective form of the affine equation
\[ x^2 + y^2 = 1 \tag{14.2} \]
by setting $x = X/Z$, $y = Y/Z$. Clearly integer or rational solutions of
(14.1) correspond to rational solutions of (14.2). Solving (14.2) for $y$ we
get $y = \pm\sqrt{1 - x^2}$, which suggests looking at the integral
\[ \int \frac{dz}{\sqrt{1 - z^2}}. \tag{14.3} \]


In order to evaluate this integral we assume that there exists a function
$s(u)$, with derivative $c(u) = ds/du$, such that
\[ c^2(u) + s^2(u) = 1. \tag{14.4} \]
To define these functions uniquely we impose ‘initial conditions’ $s(0) =
0$, $c(0) = 1$.
Given such a function, we can evaluate (14.3) by substituting $z = s(u)$.
Then $dz = c(u)\,du$, so
\[ \int \frac{dz}{\sqrt{1 - z^2}} = \int \frac{c(u)}{\sqrt{1 - s^2(u)}}\,du = \int du = u. \]
Therefore
\[ \int \frac{dz}{\sqrt{1 - z^2}} = s^{-1}(z). \tag{14.5} \]
Turning all this round, we can use (14.5) as the definition of the function
$s$, by means of the equation
\[ \int_0^{s(u)} \frac{dz}{\sqrt{1 - z^2}} = u \tag{14.6} \]
and we deduce that $s$ and its derivative $c$ satisfy (14.4), together with the
initial conditions. Similar arguments (Exercise 1 at the end of this chapter)
show that
\[ s(-u) = -s(u), \qquad c(-u) = c(u) \]
for all $u$ for which $s(u)$, $c(u)$ are defined.

The abstract theory of the Riemann integral guarantees that $s(u)$ is
defined for $u$ in some neighbourhood of 0. We can extend the definitions
of $s, c$ to the whole of $\mathbb{R}$ by making use of some of their special properties.
So we now deduce the standard properties of trigonometric functions from
our definition. Differentiate (14.4) to get
\[ 2s(u)c(u) + 2c(u)c'(u) = 0 \]
proving that
\[ \frac{dc}{du} = -s(u). \tag{14.7} \]
We are now in a position to derive the standard power series for sine and
cosine by invoking Taylor’s Theorem. By induction, successive derivatives
of $s, c$ at the origin are given by
\[
s^{(n)}(0) =
\begin{cases}
0 & \text{if } n \equiv 0 \pmod 4 \\
1 & \text{if } n \equiv 1 \pmod 4 \\
0 & \text{if } n \equiv 2 \pmod 4 \\
-1 & \text{if } n \equiv 3 \pmod 4
\end{cases}
\]
and
\[
c^{(n)}(0) =
\begin{cases}
1 & \text{if } n \equiv 0 \pmod 4 \\
0 & \text{if } n \equiv 1 \pmod 4 \\
-1 & \text{if } n \equiv 2 \pmod 4 \\
0 & \text{if } n \equiv 3 \pmod 4
\end{cases}
\]
so that
\[
s(x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!}, \qquad
c(x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}. \tag{14.8}
\]


These series are absolutely convergent for all $x \in \mathbb{R}$ and therefore define
$s, c$ on the whole of $\mathbb{R}$. Indeed, we can replace $x \in \mathbb{R}$ by $z \in \mathbb{C}$ and extend
the definitions to the complex plane: now $s, c$ are complex analytic. The
identity (14.4) now holds when $x$ is replaced by any complex $z$, because
two power series that are equal on an open set of real values $x$ are equal
throughout $\mathbb{C}$. At this stage we are entitled to replace $s$ by $\sin$ and $c$ by
$\cos$, but to emphasise the logical line of development we continue to use
the notation $s, c$.
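The series (14.8) can be evaluated directly, for real or complex arguments, and the identity (14.4) checked numerically. The truncation at 30 terms is an assumption that is ample for the small arguments used here; the functions are named `s` and `c` to match the text.

```python
import math

def s(z, terms=30):
    """Partial sum of the series for s in (14.8); z may be real or complex."""
    return sum((-1)**n * z**(2*n + 1) / math.factorial(2*n + 1)
               for n in range(terms))

def c(z, terms=30):
    """Partial sum of the series for c in (14.8); z may be real or complex."""
    return sum((-1)**n * z**(2*n) / math.factorial(2*n)
               for n in range(terms))

for z in (0.3, 1.0, 2.5, 0.5 + 0.25j):
    # c^2 + s^2 should equal 1 up to rounding error.
    print(z, abs(c(z)**2 + s(z)**2 - 1))
```

For real arguments the values agree with the library functions `math.sin` and `math.cos` to within rounding error, which is consistent with the identification of $s$ with $\sin$ and $c$ with $\cos$.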


14.1.2 Addition Formulas and Parametrization of the Circle 


We next seek formulas for $s(u + v)$ and $c(u + v)$. Working backwards
from (14.7) we see that Equation (14.4) is equivalent to $X(u) = s(u)$
and $X(u) = c(u)$ being independent solutions of the second order linear
differential equation
\[ \frac{d^2 X}{du^2} + X = 0. \tag{14.9} \]
The general solution of (14.9), with arbitrary initial conditions, is
\[ X(u) = As(u) + Bc(u) \]
for constants $A, B$. Let $v$ be any constant. Clearly $X(u) = s(u + v)$ is a
solution of (14.9), so
\[ s(u + v) = As(u) + Bc(u) \tag{14.10} \]
and by differentiation with respect to $u$ we also have
\[ c(u + v) = Ac(u) - Bs(u) \tag{14.11} \]
where the constants $A, B$ may depend on $v$. Letting $u = 0$ we see that
$A = c(v)$, $B = s(v)$. Thus we have the addition formulas
\[
\begin{aligned}
s(u + v) &= s(u)c(v) + c(u)s(v), \\
c(u + v) &= c(u)c(v) - s(u)s(v).
\end{aligned}
\]


We are now ready to return to the Pythagorean equation (14.2), which
defines a curve in $\mathbb{R}^2$, the unit circle $S^1$. By Equation (14.4) the point
$(c(u), s(u))$ lies in $S^1$ for any $u \in \mathbb{R}$. That is, there is a map
\[ \Omega : \mathbb{R} \to S^1, \qquad u \mapsto (c(u), s(u)). \]
This map is continuous, indeed infinitely differentiable.
The real line $\mathbb{R}$ has a natural abelian group operation $+$, and we can
use the map $\Omega$ to transport this to $S^1$ if we define $\oplus$ as follows:
\[ (x_1, y_1) \oplus (x_2, y_2) = (x_1 x_2 - y_1 y_2,\; x_1 y_2 + y_1 x_2). \]
Under this operation $S^1$ is an abelian group with identity element $(1, 0)$.
The inverse of $(c, s)$ is $(c, -s)$. The addition formulas for $s, c$ now say that
$\Omega$ is a group homomorphism, that is,
\[ \Omega(u + v) = \Omega(u) \oplus \Omega(v). \tag{14.12} \]


The map $\Omega$ cannot be a group isomorphism since $S^1$ is compact but $\mathbb{R}$
is not. Therefore $\Omega$ has a non-trivial kernel $K$. We claim that there is a
unique real number $\omega > 0$ such that
\[ K = \omega\mathbb{Z}. \]
To see this observe that the derivative of $\Omega$ is nonsingular near $u = 0$, so
there exists $\varepsilon > 0$ such that $\Omega(u) \neq (1, 0)$ whenever $0 \neq u$ and $|u| < \varepsilon$. It
follows that there exists a smallest real number $\omega > 0$ for which $\Omega(\omega) =
(1, 0)$. Then $\Omega(n\omega) = (1, 0)$ for all $n \in \mathbb{Z}$ since $K$ is a subgroup. We claim
that $K = \omega\mathbb{Z}$. If not, there exists $k \in K$ such that
\[ n\omega < k < (n + 1)\omega \]
for some $n \in \mathbb{Z}$. But then $k - n\omega \in K$ and $0 < k - n\omega < \omega$, a contradiction.

Numerical computations show that $\omega \approx 6.28$, and of course $\omega = 2\pi$.
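One way such a numerical computation might go is sketched below. It evaluates $s$ from the series (14.8) and locates its first positive zero $h$ by bisection; the addition formulas then give $\Omega(2h) = (1, 0)$. The bracketing interval $[3, 4]$, and the fact that $2h$ is the smallest positive element of the kernel, are assumptions of this sketch rather than things it proves.

```python
import math

def s(u, terms=40):
    """Partial sum of the series (14.8) for s."""
    return sum((-1)**n * u**(2*n + 1) / math.factorial(2*n + 1)
               for n in range(terms))

def first_positive_zero(lo=3.0, hi=4.0, steps=60):
    # Assumption: s > 0 on (0, 3] and s(4) < 0, so the first positive
    # zero of s lies in [3, 4]; bisection narrows the bracket.
    for _ in range(steps):
        mid = (lo + hi) / 2
        if s(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

half_period = first_positive_zero()
omega = 2 * half_period
print(round(omega, 2))   # 6.28
```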

Since $\Omega$ is a homomorphism and $\omega \in K$, we deduce that $\Omega(u + \omega) =
\Omega(u)$ for all $u \in \mathbb{R}$ (hence also for all $u \in \mathbb{C}$). That is, both $s$ and $c$ are
$\omega$-periodic.

All of the usual properties of the trigonometric functions now follow
by standard methods. In particular we can show (Exercise 2) that $\Omega$ is
onto. Geometrically, $\Omega$ wraps $\mathbb{R}$ round $S^1$ infinitely many times, in such a
manner that the usual distance in $\mathbb{R}$ becomes arc-length in $S^1$ (since
$c'(u)^2 + s'(u)^2 = 1$).


14.1.3. The Pythagorean Equation 


The group structure on $S^1$ defined by $\oplus$ has an interesting implication for
the Pythagorean Equation. Namely, given two solutions
\[
\begin{aligned}
x^2 + y^2 &= 1 \\
u^2 + v^2 &= 1
\end{aligned}
\]
it implies that there exists a further solution
\[ (xu - yv)^2 + (xv + yu)^2 = 1. \tag{14.13} \]
For example, from the standard $(3, 4, 5)$ right triangle we know that
$x = 3/5$, $y = 4/5$ is a solution, and so is $u = 3/5$, $v = 4/5$. By (14.13)
we compute a new solution $(-7/25, 24/25)$, so that $7^2 + 24^2 = 25^2$. In this
manner we can obtain an infinite number of rational solutions of (14.2),
hence of integer solutions of (14.1).
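The composition of solutions in (14.13) is simple to carry out with exact fractions. In the sketch below the name `oplus` is an assumption standing for the operation $\oplus$; repeated composition of $(3/5, 4/5)$ with itself produces a sequence of rational points on the unit circle.

```python
from fractions import Fraction as F

def oplus(p, q):
    """(x, y) (+) (u, v) = (xu - yv, xv + yu), as in (14.13)."""
    (x, y), (u, v) = p, q
    return (x*u - y*v, x*v + y*u)

p = (F(3, 5), F(4, 5))
q = p
for _ in range(3):
    q = oplus(q, p)
    # Each new point is rational and still satisfies x^2 + y^2 = 1.
    print(q, q[0]**2 + q[1]**2 == 1)
```

The first two points produced are $(-7/25, 24/25)$ and $(-117/125, 44/125)$, corresponding to the integer solutions $7^2 + 24^2 = 25^2$ and $117^2 + 44^2 = 125^2$.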

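We now seek to characterise those $u \in \mathbb{R}$ for which $\Omega(u) \in \mathbb{Q}^2$, that is,
both $s(u)$ and $c(u)$ are rational. To this end we introduce
\[ t = t(u) = \tan\frac{u}{2}. \]
From the identities
\[
\cos u = \cos^2\frac{u}{2} - \sin^2\frac{u}{2}, \qquad
\sin u = 2\sin\frac{u}{2}\cos\frac{u}{2}, \qquad
1 = \cos^2\frac{u}{2} + \sin^2\frac{u}{2}
\]
we find that
\[ \cos u = \frac{1 - t^2}{1 + t^2}, \qquad \sin u = \frac{2t}{1 + t^2}. \]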

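Proposition 14.1. $\Omega(u) \in \mathbb{Q}^2$ if and only if $t \in \mathbb{Q}$.


Proof: Clearly $t \in \mathbb{Q}$ implies $\Omega(u) \in \mathbb{Q}^2$.
If $\Omega(u) \in \mathbb{Q}^2$ and $t \neq 0$ (the case $t = 0$ being trivial) then
\[ \frac{1 + t^2}{2t} = p \in \mathbb{Q}, \qquad \frac{1 - t^2}{2t} = q \in \mathbb{Q} \]
so that
\[ 1 + t^2 = 2pt, \qquad 1 - t^2 = 2qt. \]
Adding,
\[ 2 = 2(p + q)t \]
so that $p + q \neq 0$ and $t = \dfrac{1}{p + q} \in \mathbb{Q}$. □

In other words, $\Omega(u) \in \mathbb{Q}^2$ if and only if $u \in 2\arctan\mathbb{Q}$.
The identity (14.4) is now equivalent to the rational identity
\[ \left( \frac{1 - t^2}{1 + t^2} \right)^2 + \left( \frac{2t}{1 + t^2} \right)^2 = 1 \]
which we have already encountered in Example 13.9. Putting $t = u/v$
where $u, v \in \mathbb{Z}$ this yields
\[ (u^2 - v^2)^2 + (2uv)^2 = (u^2 + v^2)^2. \]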


In Lemma 11.1 we showed that all primitive integer solutions of the 
Pythagorean equation (that is, solutions without common factors) are of 
this form. Thus we have found a link between the trigonometric functions 
(especially t(u)), the Pythagorean equation, and the unit circle in the plane. 
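The two directions of Proposition 14.1 can be illustrated by a round trip in exact arithmetic: from a rational $t$ to the point $(c, s)$ on the circle, and back again via $t = 1/(p + q)$ with $p = 1/s$ and $q = c/s$ as in the proof. The function names are assumptions, and the sketch requires $s \neq 0$, just as the division in the proof does.

```python
from fractions import Fraction as F

def point_from_t(t):
    """(c, s) = ((1 - t^2)/(1 + t^2), 2t/(1 + t^2))."""
    t = F(t)
    return ((1 - t*t) / (1 + t*t), 2*t / (1 + t*t))

def t_from_point(c, s):
    """Recover t = 1/(p + q), where p = 1/s and q = c/s (requires s != 0)."""
    p, q = 1 / F(s), F(c) / F(s)
    return 1 / (p + q)

for t in (F(1, 2), F(-3, 7), F(5)):
    c, s = point_from_t(t)
    print(t, (c, s), t_from_point(c, s) == t)
```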


14.1.4 A Curious Series 


Later in this chapter we will develop a profound generalisation of the above
to elliptic curves. We could continue to explore the circle and its links for
some time, but we content ourselves with one curious formula that makes
the $2\pi$-periodicity of the trigonometric functions ‘obvious’. Its analogue for
doubly periodic complex functions forms the basis of Weierstrass’s approach
to elliptic functions. The relevant identity is
\[ \csc z = \frac{1}{z} + \sum_{n \in \mathbb{Z} \setminus \{0\}} (-1)^n \left( \frac{1}{z - n\pi} + \frac{1}{n\pi} \right). \tag{14.14} \]
This series is absolutely convergent and so may be differentiated term by
term, yielding the simpler identity
\[ \csc z \cot z = \sum_{n \in \mathbb{Z}} \frac{(-1)^n}{(z - n\pi)^2}. \tag{14.15} \]
The series in (14.14) would itself be simpler if we could use
\[ \sum_{n \in \mathbb{Z}} \frac{(-1)^n}{z - n\pi} \tag{14.16} \]
but unfortunately this series fails to converge absolutely, which is why the more
complicated (14.14) replaces it.

If we replace $z$ in (14.15) by $z + 2\pi$ then the entire series shifts two places
along, term by term, and the signs $(-1)^n$ are unchanged by such a shift. This
makes it obvious that the function $\csc z \cot z$
is $2\pi$-periodic in $z$. It is straightforward to parlay this result into
$2\pi$-periodicity of $\sin z$ and $\cos z$. So it is possible to define the trigonometric
functions in terms of a series whose $2\pi$-periodicity is immediately apparent
from its form.
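Both identities can be tested numerically by comparing partial sums with the library values of the trigonometric functions. In the sketch below the terms $n$ and $-n$ of (14.14) are paired, so that the constants $\pm 1/(n\pi)$ cancel; the cutoff `N = 2000` and the function names are assumptions, and the agreement is only to the accuracy of the truncated sums.

```python
import math

def csc_partial(z, N=2000):
    """Symmetric partial sum of (14.14), pairing the terms n and -n."""
    total = 1 / z
    for n in range(1, N + 1):
        sign = -1 if n % 2 else 1
        # 1/(n*pi) from the term n cancels -1/(n*pi) from the term -n.
        total += sign * (1 / (z - n * math.pi) + 1 / (z + n * math.pi))
    return total

def csc_cot_partial(z, N=2000):
    """Partial sum of (14.15) over -N <= n <= N."""
    total = 0.0
    for n in range(-N, N + 1):
        sign = -1 if n % 2 else 1
        total += sign / (z - n * math.pi) ** 2
    return total

z = 1.0
print(csc_partial(z), 1 / math.sin(z))
print(csc_cot_partial(z), math.cos(z) / math.sin(z) ** 2)
# Shifting z by 2*pi shifts the summation index by two and leaves the sum
# unchanged in the limit; the partial sums agree up to truncation error.
print(csc_cot_partial(z + 2 * math.pi))
```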

We outline the derivation of (14.14). We rely on standard ideas from
complex analysis about residues. Suppose $f(z)$ is a complex analytic
function whose only singularities in $\mathbb{C}$ are poles $z = a_j$, $j = 1, 2, 3, \ldots$ with
residues $b_j$, where $0 < |a_1| \le |a_2| \le |a_3| \le \cdots$. Suppose that there
exists a sequence of circles $C_j$, with centre the origin and of radius $R_j$ which
tends to infinity with $j$, not passing through any poles, with $f(z)$ uniformly
bounded on the circles $C_j$; that is, $|f(z)| < M$ for all $z \in \bigcup_j C_j$.

An example is $f(z) = \csc z$, with $R_j = (j + \tfrac{1}{2})\pi$, to which we return
shortly.

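By the residue theorem (Stewart and Tall [73] Chapter 12) if $z$ is not a
pole of $f$ then

$$\frac{1}{2\pi i} \oint_{C_j} \frac{f(\zeta)}{\zeta - z}\, d\zeta = f(z) + \sum_n \frac{b_n}{a_n - z}$$

where the sum is over all poles interior to $C_j$. Now

$$\frac{1}{2\pi i} \oint_{C_j} \frac{f(\zeta)}{\zeta - z}\, d\zeta
  = \frac{1}{2\pi i} \oint_{C_j} \frac{f(\zeta)}{\zeta}\, d\zeta + \frac{z}{2\pi i} \oint_{C_j} \frac{f(\zeta)}{\zeta(\zeta - z)}\, d\zeta
  = f(0) + \sum_n \frac{b_n}{a_n} + \frac{z}{2\pi i} \oint_{C_j} \frac{f(\zeta)}{\zeta(\zeta - z)}\, d\zeta.$$

As $j \to \infty$ the last integral tends to zero, because $f$ is uniformly
bounded on the circles, and we get

$$f(z) = f(0) + \sum_n b_n \left( \frac{1}{z - a_n} + \frac{1}{a_n} \right). \tag{14.17}$$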


To prove (14.14) we now set $f(z) = \csc z - \frac{1}{z}$. The singularities are at
$z = a_j = j\pi$, $j \ne 0$, with residues $b_j = (-1)^j$; the singularity at $0$ is
removable, with $f(0) = 0$. The conditions of the above calculation are
easily checked, so (14.14) follows.

Weierstrass uses a very similar series to define a class of doubly periodic 
functions (and has a similar problem with convergence of the simplest form 
of the series, which he solves in the same manner). We now begin the 
development of Weierstrass’s theory.

We have seen that the trigonometric functions have a number of striking
properties, including:


• Periodicity: $\sin(\theta + 2\pi) = \sin(\theta)$, $\cos(\theta + 2\pi) = \cos(\theta)$


• Algebraic differential equation: $u'(\theta)^2 = 1 - u(\theta)^2$ where $u(\theta) = \sin(\theta)$
or $\cos(\theta)$


• Parametrization of circle: $(\cos(\theta), \sin(\theta))$ lies on the unit circle
$x^2 + y^2 = 1$ and every point on the circle is of this form


• Addition theorems: $\cos(\theta + \phi) = \cos(\theta)\cos(\phi) - \sin(\theta)\sin(\phi)$,
$\sin(\theta + \phi) = \sin(\theta)\cos(\phi) + \cos(\theta)\sin(\phi)$


• Integration of algebraic functions: for example,


$$\int \frac{dz}{\sqrt{1 - z^2}} = \sin^{-1}(z).$$

These properties are all interconnected; moreover, they illuminate the
theory of quadratic Diophantine equations, and in particular the Pythagorean
Equation $X^2 + Y^2 - Z^2 = 0$, which in affine form is $x^2 + y^2 - 1 = 0$.

In 1811 Legendre published the first of a series of three volumes initi- 
ating a profound generalization of trigonometric functions, known—for the 
rather peripheral reason that they lead to a formula for the arc length of an 
ellipse—as ‘elliptic functions’. In fact Legendre worked only with ‘elliptic 
integrals’, of which the most important is 


$$F(k, z) = \int_0^z \frac{dt}{\sqrt{(1 - t^2)(1 - k^2 t^2)}} \tag{14.18}$$


for a complex variable z and a complex constant k. This particular integral 
is the elliptic integral of the first kind: in Legendre’s theory there are also 
elliptic integrals of the second and third kind. A detailed treatment can be 
found in Hancock [33] page 187. 

Gauss, Abel, and Jacobi all noticed something that had eluded Legen- 
dre. (The history is complicated. Gauss never published the idea; Jacobi 
made it the basis of his monumental and influential work published in 1829; 
and a manuscript by Abel was submitted to the French Academy of Sci- 
ences in 1826, mislaid by Cauchy, and not published until 1841, by which 
time Abel was long dead. Abel also published work on elliptic functions 
in 1827, however.) Their common idea was to consider the integral (14.18)
not as defining a function, but as defining its inverse function. Denote this
inverse function by $\operatorname{sn} u$: it satisfies the equation


$$F(k, \operatorname{sn} u) = u.$$


Strictly speaking, sn is a family of functions parametrized by k. Associated 
with it are two other functions 


$$\operatorname{cn} u = \sqrt{1 - \operatorname{sn}^2 u},$$
$$\operatorname{dn} u = \sqrt{1 - k^2 \operatorname{sn}^2 u}.$$


These functions have remarkable properties reminiscent of the trigonomet- 
ric functions: a sample is 


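$$\operatorname{sn}(x + y) = \frac{\operatorname{sn} x \operatorname{cn} y \operatorname{dn} y + \operatorname{sn} y \operatorname{cn} x \operatorname{dn} x}{1 - k^2 \operatorname{sn}^2 x \operatorname{sn}^2 y}.$$
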
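A vast range of similar identities can be found in Cayley [13] and
Hancock [33]. The most remarkable property of all, though, is that the
functions $\operatorname{sn}, \operatorname{cn}, \operatorname{dn}$ are all doubly periodic. That is, there exist two complex
constants $\omega_1, \omega_2$ (depending on $k$) that are linearly independent over $\mathbb{R}$,
such that


$$\operatorname{sn}(z + \omega_1) = \operatorname{sn}(z + \omega_2) = \operatorname{sn} z,$$
$$\operatorname{cn}(z + \omega_1) = \operatorname{cn}(z + \omega_2) = \operatorname{cn} z,$$
$$\operatorname{dn}(z + \omega_1) = \operatorname{dn}(z + \omega_2) = \operatorname{dn} z.$$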


Legendre had recognised the equivalent property of his elliptic integrals, 
but its expression is far more cumbersome. Moreover, sn,cn,dn are all 
meromorphic functions of a complex variable, meaning that they are ana- 
lytic except for isolated singularities, which are all poles. (See Stewart and 
Tall [73] or any other text on complex analysis for terminology.) 

In 1882 Weierstrass developed a somewhat different approach to the
whole topic of doubly periodic functions, based on a function denoted $\wp$.
(The symbol $\wp$ is pronounced ‘pay’ and is a stylized old German ‘p’.) The
Weierstrass $\wp$-function is closely connected with elliptic curves, and the
remainder of this section is devoted to this connection.

The starting-point is to consider an arbitrary doubly periodic meromorphic
function $f(z)$, where $z \in \mathbb{C}$. Then there are two complex constants
$\omega_1, \omega_2$ that are linearly independent over $\mathbb{R}$, such that


$$f(z + \omega_1) = f(z + \omega_2) = f(z). \tag{14.19}$$

This implies that for all $(m, n) \in \mathbb{Z}^2$

$$f(z + m\omega_1 + n\omega_2) = f(z) \tag{14.20}$$


and opens up some interesting geometry of C. 


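Definition 14.2. Let $\omega_1, \omega_2 \in \mathbb{C}$ be linearly independent over $\mathbb{R}$. Then
the set


$$\mathcal{L} = \mathcal{L}_{\omega_1, \omega_2} = \{ z \in \mathbb{C} : z = m\omega_1 + n\omega_2 \text{ where } (m, n) \in \mathbb{Z}^2 \}$$

is the lattice generated by $\omega_1, \omega_2$.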


We studied lattices in $\mathbb{R}^n$ in Chapter 6. The above definition is a special
case, and arises when we identify $\mathbb{C}$ with $\mathbb{R}^2$. When (14.20) holds, we say
that $f$ is $\mathcal{L}$-periodic.

Suppose that $T$ is the fundamental domain of $\mathcal{L}$, see Chapter 6 just
before Lemma 6.2. Then by Lemma 6.2, every $z \in \mathbb{C}$ lies in exactly one
of the sets $T + l$ for $l \in \mathcal{L}$. Therefore $f(z)$ is uniquely determined once
we know $f(t)$ for all $t \in T$. Classically, the topological closure $\overline{T}$ of $T$ is
called the period parallelogram. We can consider $f$ to be a function on the
quotient torus $\mathbb{T}^2 = \mathbb{C}/\mathcal{L}$.

The main problem is to define ‘interesting’ doubly periodic functions
for a given lattice $\mathcal{L}$. Weierstrass’s idea is a generalisation of (14.14, 14.15),
namely that these can be obtained by summing over translates by the
lattice. That is, take some (initially arbitrary) function $g(z)$ and consider


$$\hat{g}(z) = \sum_{l \in \mathcal{L}} g(z - l).$$


Then $\hat{g}$ is obviously doubly periodic, because for any $l' \in \mathcal{L}$


$$\hat{g}(z + l') = \sum_{l \in \mathcal{L}} g(z - l + l') = \sum_{l'' \in \mathcal{L}} g(z + l'') = \hat{g}(z)$$


where $l'' = l' - l$.
This is all very well, but it is necessary to choose g(z) with care. Three 
things can go wrong: 


• The series defining $\hat{g}$ fails to converge.

• The series converges but $\hat{g}$ is not meromorphic.

• The series converges but $\hat{g}$ turns out to be constant.

Avoiding these pitfalls requires a certain amount of foresight. It led
Weierstrass to the choice $g(z) = \frac{2}{z^3}$, for which $\hat{g}(z) = -\wp'(z)$, and from this to
a suitable choice of (some constant multiple of) the integral of this
particular $\hat{g}$. It turns out that taking $g(z) = \frac{1}{z^2}$ is not a good idea: the series
for $\hat{g}$ fails to converge. However, by adding a suitable ‘arbitrary constant’
(which, taken on its own, also fails to converge, but in the ‘opposite’ way)
he could obtain a meromorphic function. His final choice was:


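Definition 14.3. Let $\mathcal{L} \subset \mathbb{C}$ be a lattice. Then the associated Weierstrass
$\wp$-function is


$$\wp(z) = \frac{1}{z^2} + \sum_{l \in \mathcal{L} \setminus \{0\}} \left( \frac{1}{(z - l)^2} - \frac{1}{l^2} \right) \tag{14.21}$$


This series is absolutely convergent provided $z \notin \mathcal{L}$ (Exercise 3).
Evidently $\wp$ is an even function, that is, $\wp(-z) = \wp(z)$. We may therefore
differentiate term by term to get


$$\wp'(z) = -2 \sum_{l \in \mathcal{L}} \frac{1}{(z - l)^3}.$$

This series is also absolutely convergent provided $z \notin \mathcal{L}$ (Exercise 3).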


Lemma 14.4. The function $\wp$ is meromorphic, and doubly periodic on the
lattice $\mathcal{L}$.


Proof: By the above discussion, $\wp'$ is meromorphic, and doubly periodic
on the lattice $\mathcal{L}$. Therefore


$$\wp'(z + \omega_j) = \wp'(z).$$

Integrate, and remember to include an arbitrary constant:

$$\wp(z + \omega_j) = \wp(z) + c_j \qquad (j = 1, 2) \tag{14.22}$$


for constants $c_j \in \mathbb{C}$. Since $\wp$ is an even function, its derivative is odd, so
$\wp'(-z) = -\wp'(z)$. Setting $z = -\tfrac{1}{2}\omega_1$ in (14.22), a point that is not a pole,
we have


$$\wp(\tfrac{1}{2}\omega_1) = \wp(-\tfrac{1}{2}\omega_1) + c_1 = \wp(\tfrac{1}{2}\omega_1) + c_1$$

so $c_1 = 0$. Similarly $c_2 = 0$, so $\wp$ is $\mathcal{L}$-periodic.


By absolute convergence, $\wp(z)$ is analytic for all $z \notin \mathcal{L}$. When $z \in \mathcal{L}$ we
may without loss of generality take $z = 0$. Then (14.21) shows that $\wp(z)$
has a pole at $0$, of order 2. Therefore all singularities are poles, and
$\wp$ is meromorphic. □


We now prove a useful lemma: 


Lemma 14.5. Suppose that f is a doubly periodic function that is analytic 
throughout C (that is, no poles or other singularities). Then f is constant. 


Proof: Let $\overline{T}$ be the closure of the fundamental domain of $\mathcal{L}$ (the period
parallelogram). Since $\overline{T}$ is compact and $f$ is analytic on $\overline{T}$, there exists a
real constant $M$ such that $|f(z)| < M$ for all $z \in \overline{T}$. By double periodicity,
$|f(z)| < M$ for all $z \in \mathbb{C}$. By Liouville’s Theorem (Stewart and Tall [73]
page 184) $f(z)$ is constant. □


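Define


$$g_2 = 60 \sum_{l \in \mathcal{L} \setminus \{0\}} l^{-4}, \qquad g_3 = 140 \sum_{l \in \mathcal{L} \setminus \{0\}} l^{-6}. \tag{14.23}$$

(These and similar expressions are called Eisenstein series.) Then it may
be shown (Hancock [33] page 324) that the Laurent series (Stewart and
Tall [73] page 195) of $\wp$, $\wp'$ take the following form:


$$\wp(z) = \frac{1}{z^2} + \frac{g_2}{20} z^2 + \frac{g_3}{28} z^4 + \cdots \tag{14.24}$$
$$\wp'(z) = -\frac{2}{z^3} + \frac{g_2}{10} z + \frac{g_3}{7} z^3 + \cdots \tag{14.25}$$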


Theorem 14.6. The Weierstrass $\wp$-function satisfies the differential
equation


$$\wp'(z)^2 = 4\wp(z)^3 - g_2 \wp(z) - g_3 \tag{14.26}$$


Proof: By direct computation,


$$\wp'(z)^2 = \frac{4}{z^6} - \frac{2g_2}{5} \frac{1}{z^2} - \frac{4g_3}{7} + O(z^2) \tag{14.27}$$
$$\wp(z)^3 = \frac{1}{z^6} + \frac{3g_2}{20} \frac{1}{z^2} + \frac{3g_3}{28} + O(z^2) \tag{14.28}$$


where $O(z^2)$ denotes a function whose Laurent series begins with terms in
$z^2$ or higher, which, in particular, is an analytic function for all $z \in \mathbb{C}$.
Therefore


$$\wp'(z)^2 - [4\wp(z)^3 - g_2 \wp(z) - g_3] = O(z^2).$$

The left-hand side, which we denote by $h(z)$, is doubly periodic; the
right-hand side is analytic throughout $\mathbb{C}$. Lemma 14.5 implies that $h(z)$ is
constant. Since $h(z) = O(z^2)$, it follows that $h(z) = 0$. □
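The differential equation can also be tested numerically. The following Python sketch truncates the lattice sums for $\wp$, $\wp'$, $g_2$ and $g_3$ to a finite square of lattice points and compares the two sides of (14.26) at one sample point. The square lattice $\omega_1 = 1$, $\omega_2 = i$, the truncation level, the sample point, and all function names are assumptions made for illustration; because the sums are truncated, the two sides agree only approximately.

```python
def lattice_points(w1, w2, N):
    """Non-zero lattice points m*w1 + n*w2 with |m|, |n| <= N (a finite truncation)."""
    return [m * w1 + n * w2
            for m in range(-N, N + 1)
            for n in range(-N, N + 1)
            if (m, n) != (0, 0)]

def wp(z, pts):
    """Truncated version of the series (14.21)."""
    return 1 / z**2 + sum(1 / (z - l)**2 - 1 / l**2 for l in pts)

def wp_prime(z, pts):
    """Truncated version of the term-by-term derivative of (14.21)."""
    return -2 / z**3 - 2 * sum(1 / (z - l)**3 for l in pts)

w1, w2 = 1.0, 1.0j                    # square lattice, chosen only as an example
pts = lattice_points(w1, w2, 40)
g2 = 60 * sum(l**-4 for l in pts)     # truncated Eisenstein series
g3 = 140 * sum(l**-6 for l in pts)

z = 0.3 + 0.2j                        # arbitrary point away from the lattice
lhs = wp_prime(z, pts) ** 2
rhs = 4 * wp(z, pts) ** 3 - g2 * wp(z, pts) - g3
print(abs(lhs - rhs) / abs(lhs))      # relative discrepancy; small, not zero
```

Increasing the truncation level shrinks the discrepancy, as one expects from absolute convergence of the series.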


Corollary 14.7. Let $C$ be an elliptic curve in Weierstrass normal form
$y^2 = 4x^3 - g_2 x - g_3$ and let $\wp$ be the corresponding Weierstrass $\wp$-function.
Then $(x, y) = (\wp(z), \wp'(z))$ lies on $C$ for all $z \in \mathbb{C}$.


In fact, every point on $C$ is of the above form, so the function $\wp$
parametrizes $C$. There is a close analogy with the parametrization of a circle
by $(\cos(\theta), \sin(\theta))$, especially since $\sin'(\theta) = \cos(\theta)$. Moreover, the
parametrization of $C$ behaves naturally with respect to the group operation
(13.9) on $C$, which for clarity we now rename $\oplus$ instead of $+$. (This
is not the same $\oplus$ that arose earlier in this chapter in connection with
trigonometric functions.) More precisely, let $C$ be an elliptic curve with
equation $y^2 = 4x^3 - g_2 x - g_3$. Then $C$ is nonsingular if and only if the
cubic $4x^3 - g_2 x - g_3$ has distinct zeros, which happens if and only if the
discriminant


$$g_2^3 - 27 g_3^2 \ne 0.$$


There is one point at infinity on $C$, namely $(0, 1, 0)$, and we choose this as
the identity $O$ of the group $G$. Straightforward calculations in coordinate
geometry then show that if


$$(x_3, y_3) = (x_1, y_1) \oplus (x_2, y_2)$$


then 
$$x_3 = \frac{1}{4} \left( \frac{y_1 - y_2}{x_1 - x_2} \right)^2 - (x_1 + x_2) \tag{14.29}$$
$$y_3 = -\frac{y_1 - y_2}{x_1 - x_2}\, x_3 - \frac{x_1 y_2 - x_2 y_1}{x_1 - x_2} \tag{14.30}$$
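The chord formulas are easy to try out in exact rational arithmetic. The following Python sketch implements (14.29) and (14.30) for two points with distinct $x$-coordinates and checks that the result lies on the curve. The curve $y^2 = 4x^3 - 4x + 4$ (that is, $g_2 = 4$, $g_3 = -4$) and the points on it are chosen by us purely as an example, and the function names are hypothetical.

```python
from fractions import Fraction

def add_points(P, Q):
    """Add two affine points with distinct x-coordinates on
    y^2 = 4x^3 - g2*x - g3, following (14.29) and (14.30)."""
    (x1, y1), (x2, y2) = P, Q
    m = Fraction(y1 - y2, x1 - x2)                         # slope of the chord
    x3 = m * m / 4 - (x1 + x2)                             # (14.29)
    y3 = -m * x3 - Fraction(x1 * y2 - x2 * y1, x1 - x2)    # (14.30)
    return (x3, y3)

def on_curve(P, g2, g3):
    """Test whether P satisfies y^2 = 4x^3 - g2*x - g3 exactly."""
    x, y = P
    return y * y == 4 * x**3 - g2 * x - g3

g2, g3 = 4, -4               # example curve y^2 = 4x^3 - 4x + 4
P, Q = (0, 2), (1, -2)       # two rational points on it
R = add_points(P, Q)
print(R, on_curve(R, g2, g3))
```

Here the sum is the point $(3, 10)$, which indeed satisfies $10^2 = 4 \cdot 27 - 12 + 4$. The tangent case $x_1 = x_2$ needs a separate formula and is not handled by this sketch.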


We now compare this with the addition theorem for the functions $\wp$, $\wp'$:


Theorem 14.8. If $u \ne v \in \mathbb{C}$ then


$$\wp(u + v) = \frac{1}{4} \left( \frac{\wp'(u) - \wp'(v)}{\wp(u) - \wp(v)} \right)^2 - (\wp(u) + \wp(v)) \tag{14.31}$$

$$\wp'(u + v) = -\frac{\wp'(u) - \wp'(v)}{\wp(u) - \wp(v)}\, \wp(u + v) - \frac{\wp(u)\wp'(v) - \wp(v)\wp'(u)}{\wp(u) - \wp(v)} \tag{14.32}$$


Proof: We sketch a proof: see Hancock [33] page 351 for details. Consider
the function


$$h(u) = \wp(u + v) - \frac{1}{4} \left( \frac{\wp'(u) - \wp'(v)}{\wp(u) - \wp(v)} \right)^2.$$


This is doubly periodic and meromorphic: its only poles are at points
where $u + v \in \mathcal{L}$. However, by construction it is finite at $u = -v$, hence
at $u = -v + l$ for all $l \in \mathcal{L}$. At $u = 0$ the function $h(u)$ goes to infinity
like $-\frac{1}{u^2}$. It follows that $h(u) + \wp(u)$ is doubly periodic and analytic for all
$u \in \mathbb{C}$. Lemma 14.5 implies that $h(u) + \wp(u)$ is constant. Setting $u = 0$ it
follows that the constant is $-\wp(v)$.

The addition formula for $\wp'$ can be obtained by differentiation and
further manipulations. □


These formulas may appear complicated, but they lead immediately to 
a very elegant theorem: 


Theorem 14.9. The group structure on $C$ has the property


$$(\wp(u), \wp'(u)) \oplus (\wp(v), \wp'(v)) = (\wp(u + v), \wp'(u + v)). \tag{14.33}$$


Proof: Compare formulas (14.29) and (14.31), and (14.30) and (14.32). □


Another way to say this is that the map $u \mapsto (\wp(u), \wp'(u))$ is a
homomorphism between $(\mathbb{C}, +)$ and $(C, \oplus)$.


14.3 Legendre and Weierstrass 


Legendre’s theory of elliptic integrals concerns the square root of a quartic
polynomial $(1 - z^2)(1 - k^2 z^2)$. Weierstrass’s theory revolves around the
square root of a cubic polynomial $4z^3 - g_2 z - g_3$. We briefly describe the
connection between the two approaches, which is essentially that suitable
birational maps transform the Legendre normal form into the Weierstrass
normal form, and conversely. The discussion is summarized from
Hancock [33] page 190.
Consider the integral


$$\int \frac{dz}{\sqrt{4z^3 - g_2 z - g_3}}. \tag{14.34}$$

By Theorem 14.6 the substitution $z = \wp(t)$, for which $dz = \wp'(t)\, dt$ and
the square root equals $\wp'(t)$, transforms this into


$$\int dt = t \tag{14.35}$$

so that (14.34) is equal to $\wp^{-1}(z)$. The similarity with (14.18) is striking.
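In fact, elementary computations show that if we make the substitution


$$w = \frac{a_3 + a_2}{2} + \frac{a_3 - a_2}{2} \cdot \frac{z - k}{1 - kz}$$


in (14.18), where


$$\left( \frac{1 - k}{1 + k} \right)^2 = \frac{a_1 - a_2}{a_1 - a_3},$$

then (14.18) becomes, up to a constant factor,


$$\int \frac{dw}{\sqrt{(w - a_1)(w - a_2)(w - a_3)}}.$$


Moreover, we can change variables from $w$ to $u = w + c$, for a suitable
constant $c$, and eliminate the quadratic term in $(w - a_1)(w - a_2)(w - a_3)$,
reducing the cubic to Weierstrass normal form:


$$\int \frac{du}{\sqrt{4u^3 - g_2 u - g_3}}$$


where, writing $a_1, a_2, a_3$ for the roots of the shifted cubic (so that
$a_1 + a_2 + a_3 = 0$),


$$g_2 = -4(a_1 a_2 + a_2 a_3 + a_3 a_1)$$
$$g_3 = 4 a_1 a_2 a_3.$$
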

The transformation is invertible, so we can also change an elliptic integral 
in Weierstrass normal form to one in Legendre normal form. In short, the 
two theories are equivalent. 

This brief description conceals some beautiful mathematics that ex- 
plains the relation between quartics and cubics that is exploited in the 
above transformation. This involves the invariants of cubic and quartic 
curves, the ‘resolvent cubic’ of a quartic equation, and the cross-ratio in- 
variant from projective geometry. See Hancock [33] Chapter VIII. 


14.4 Modular Functions 


The link between elliptic curves and Fermat’s Last Theorem stems from
a profound generalisation of doubly periodic complex functions. Liouville
proved that every single-valued doubly periodic meromorphic function on
a given lattice can be expressed as a rational function of Weierstrass’s
$\wp$-function and its derivative (Hancock [33] page 437). This theorem classifies
all such doubly periodic functions, and at first sight leaves little room for 
generalisations. However, translations are not the only interesting trans- 
formations of the complex plane. 

Suppose that $\Gamma$ is some group of invertible maps $\mathbb{C} \to \mathbb{C}$. Then we can
seek complex functions $f$ that are invariant under $\Gamma$, meaning that


$$f(\gamma(z)) = f(z)$$


for all $z \in \mathbb{C}$, $\gamma \in \Gamma$. Doubly periodic functions arise in this way if we take
$\Gamma$ to be the group of all translations


$$z \mapsto z + m\omega_1 + n\omega_2$$


by elements $m\omega_1 + n\omega_2$ of the lattice $\mathcal{L}$.
The question is: what groups $\Gamma$ lead to interesting results? Translations
are a special case of an important class of transformations of $\mathbb{C}$:


Definition 14.10. Let $a, b, c, d \in \mathbb{C}$ with $ad - bc \ne 0$. Then the function


$$g(z) = \frac{az + b}{cz + d}$$

is called a Möbius map or bilinear map.


There is a technical problem with such functions: they take the value
$\infty$ when $z = -d/c$. The usual way to get round this is to extend $\mathbb{C}$ to the
Riemann sphere $\mathbb{C} \cup \{\infty\}$, see Stewart and Tall [73] page 207. If this is
done, the set of Möbius maps forms a group under composition (Exercise
4).

Straight lines and circles in $\mathbb{C}$ correspond to circles on $\mathbb{C} \cup \{\infty\}$.
Moreover, every Möbius map sends circles on $\mathbb{C} \cup \{\infty\}$ to circles. Being
complex-analytic, every Möbius map is conformal: it preserves the angles at which
curves meet. So Möbius maps have several remarkable properties.
Translations are Möbius maps: take $a = 1$, $c = 0$, $d = 1$ to get $z \mapsto z + b$. So we
may hope to find generalisations of doubly periodic functions among the
Möbius maps.

The trick now is to choose fruitful subgroups of the group of Möbius
maps. Taking the whole group leads to nothing of interest, because it is
easy to see that a function that is invariant under all translations $z \mapsto z + b$
must be constant. As a clue, the translations by a lattice are defined in
terms of a pair of integers $(m, n)$. This leads to the following choice of
group:

Definition 14.11. The modular group is the group of all Möbius maps


$$g(z) = \frac{az + b}{cz + d}$$

where $a, b, c, d \in \mathbb{Z}$ and $ad - bc = 1$.


It is easily seen to be a group (Exercise 5). Abstractly, it can be
described in terms of the group $\mathrm{SL}_2(\mathbb{Z})$ of all matrices


$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

with $a, b, c, d \in \mathbb{Z}$ and $ad - bc = 1$. In fact, if $Z$ is the subgroup comprising
$\pm I$ where $I$ is the identity matrix, the modular group is isomorphic to


$$\mathrm{PSL}_2(\mathbb{Z}) = \mathrm{SL}_2(\mathbb{Z})/Z$$


which is known as the projective special linear group. See Exercise 6.

For any lattice $\mathcal{L}$, the group of lattice-translations has a fundamental
domain, as described earlier. The defining property of a fundamental
domain is that every point of $\mathbb{C}$ lies in exactly one of its translates by the
lattice. The modular group also has a fundamental domain, but this has
more subtle geometry than a parallelogram. Its construction is assisted by
defining two particular elements of the modular group, namely


$$S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}. \tag{14.36}$$

These correspond to the functions

$$S(z) = -\frac{1}{z}, \qquad T(z) = z + 1.$$

It is easy to see that $S^2 = -I$ which in $\mathrm{PSL}_2(\mathbb{Z})$ represents the same
element as $I$, that $T$ has order $\infty$, and $(ST)^3 = I$ in $\mathrm{PSL}_2(\mathbb{Z})$. See Exercise 7.
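These relations can be checked by direct matrix multiplication. The short Python sketch below does so for the matrices $S$ and $T$, and also confirms at one sample point that composing the Möbius maps agrees with multiplying the matrices. The helper names `matmul` and `mobius` are ours.

```python
def matmul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def mobius(M, z):
    """Apply the Mobius map z -> (az + b)/(cz + d) for M = ((a, b), (c, d))."""
    (a, b), (c, d) = M
    return (a * z + b) / (c * z + d)

S = ((0, -1), (1, 0))
T = ((1, 1), (0, 1))
MINUS_I = ((-1, 0), (0, -1))

ST = matmul(S, T)
print(matmul(S, S) == MINUS_I)                  # S^2 = -I
print(matmul(ST, matmul(ST, ST)) == MINUS_I)    # (ST)^3 = -I, the identity in PSL_2(Z)
print(mobius(ST, 2j), mobius(S, mobius(T, 2j))) # the same point, computed two ways
```

In $\mathrm{SL}_2(\mathbb{Z})$ both $S^2$ and $(ST)^3$ equal $-I$; they become the identity only after passing to the quotient by $\pm I$.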
Let $H = \{z : \operatorname{Im}(z) > 0\}$ be the upper half-plane in $\mathbb{C}$. (Despite our
choice of terminology earlier, the pictures we shall draw make it difficult
to call this the upper half-line.) It is easy to check that the modular group
maps $H$ to itself. We define the modular domain $D$ by


$$D = \{z : -\tfrac{1}{2} \le \operatorname{Re}(z) \le 0,\ |z| = 1 \ \text{ or } \ -\tfrac{1}{2} \le \operatorname{Re}(z) < \tfrac{1}{2},\ |z| > 1\}.$$

(See Figure 14.1.)


Figure 14.1. Fundamental domain for the modular group.


Theorem 14.12. D is a fundamental domain for the modular group acting 
on H. 


Proof: See Goldman [31] page 184. □
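Although we do not reproduce the proof, the idea that every point of $H$ can be moved into the region $|\operatorname{Re}(z)| \le \tfrac{1}{2}$, $|z| \ge 1$ by powers of $T$ and applications of $S$ is easy to illustrate. The Python sketch below is a minimal version of that reduction, not the argument in Goldman: it works in floating point, the function name is ours, and it ignores the boundary identifications that single out $D$ itself.

```python
import math

def reduce_to_domain(z, max_steps=1000):
    """Move z (with Im z > 0) into the closed region |Re z| <= 1/2, |z| >= 1
    by alternately applying a power of T (z -> z + k) and S (z -> -1/z)."""
    for _ in range(max_steps):
        # translate by an integer so that the real part lies in [-1/2, 1/2)
        z = complex(z.real - math.floor(z.real + 0.5), z.imag)
        if abs(z) >= 1:
            return z
        z = -1 / z        # apply S; this increases Im z when |z| < 1
    raise RuntimeError("no reduction found within max_steps")

w = reduce_to_domain(0.3 + 0.2j)
print(w)
```

For the sample point $0.3 + 0.2i$ one application of $S$ followed by a translation by 2 suffices, giving $-\tfrac{4}{13} + \tfrac{20}{13} i$.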


The proof shows more: the effects of S and T on D are as in Figure 14.2. 


Figure 14.2. Tessellation of the upper half-plane by images of the fundamental
domain.

One final ingredient is needed before we can define an elliptic modular
function, namely: the concept of a Riemann surface. This concept was
introduced in Riemann’s Inaugural Dissertation [61] of 1851, as a general
method for making sense of ‘multi-valued’ complex functions. We explain
the idea briefly for the function $f(z) = \sqrt{z}$, and then describe an abstract
generalization in which the surface is not associated with a previously
defined function.

Every non-zero complex number, like every non-zero positive real
number, has two distinct square roots. Indeed if $z = re^{i\theta}$, then $\sqrt{z} = +\sqrt{r}e^{i\theta/2}$
or $-\sqrt{r}e^{i\theta/2}$. When $z$ is real and positive, the two choices can be
distinguished by their sign, and it is reasonable to consider the positive square
root as the ‘natural’ choice and the negative one as a secondary alternative.
In the complex case, no straightforward distinction of this kind is possible,
for the following reason. The complex setting reveals exactly why there
are two choices of sign: the key point is that
$-\sqrt{r}e^{i\theta/2} = +\sqrt{r}e^{i\theta/2 + i\pi} = +\sqrt{r}e^{i(\theta + 2\pi)/2}$.
That is, there are two alternatives not because the modulus
$r$ has two real square roots, but because the argument $\theta$ is defined only
modulo $2\pi$, and different choices of argument lead to two different values
for the square root. The role of the argument is clearer if we consider the
cube root $\sqrt[3]{z}$, which takes any of three values:


$$\sqrt[3]{r}\,e^{i\theta/3}, \qquad \sqrt[3]{r}\,e^{i\theta/3 + 2\pi i/3}, \qquad \sqrt[3]{r}\,e^{i\theta/3 + 4\pi i/3}$$


because the choices of argument $\theta, \theta + 2\pi, \theta + 4\pi$ lead to different results.
(However, $\theta + 6\pi$ leads to the same cube root as $\theta$.)

The impossibility of defining one choice to be the ‘natural’ square or
cube root becomes obvious if we consider what happens as $z$ moves along
a continuous path in $\mathbb{C}$. For example, suppose that $z = e^{it}$ and $t$ runs
from $0$ to $2\pi$. When $t = 0$ the two square roots are $+1, -1$. The same is
true when $t = 2\pi$. However, if we require $\sqrt{z}$ to vary continuously with
$z$, then $1$ lies on the path of square roots given by $\sqrt{e^{it}} = e^{it/2}$, and $-1$
lies on the path of square roots given by $\sqrt{e^{it}} = -e^{it/2}$. As $t$ increases
from $0$ to $2\pi$, the choice $1$ changes continuously into $-1$, while the choice
$-1$ changes continuously into $1$. That is, the choices $1, -1$ swap places.
Therefore neither can be considered more natural than the other.
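The swap can be watched numerically. The Python sketch below follows a square root of $z = e^{it}$ in small steps around the unit circle, at each step keeping whichever of the two roots is closer to the previous value. This nearest-root rule is a discrete stand-in for continuity, chosen here for illustration; the function name and step count are ours.

```python
import cmath
import math

def continue_sqrt(start_root, steps=1000):
    """Follow a square root of z = e^{it} continuously as t runs from 0 to 2*pi."""
    root = start_root
    for k in range(1, steps + 1):
        z = cmath.exp(1j * 2 * math.pi * k / steps)
        r = cmath.sqrt(z)          # principal root; the other root is -r
        # keep whichever of the two roots is closer to the previous value
        root = r if abs(r - root) <= abs(-r - root) else -r
    return root

print(continue_sqrt(1))     # the path that starts at +1
print(continue_sqrt(-1))    # the path that starts at -1
```

The path that starts at $+1$ finishes at (approximately) $-1$, and the path that starts at $-1$ finishes at $+1$.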

Prior to Riemann, such phenomena were handled by declaring the
function to be ‘multi-valued’, and prescribing rules for how choices of values
should be made. Riemann’s idea for coping with such behaviour is radically
different: define the function to be single-valued (as is now conventional
whenever the word ‘function’ is used), but specify a domain that is
different from the usual complex plane $\mathbb{C}$. For the function $\sqrt{z}$ Riemann’s
construction can be described in terms of two superposed copies $C_1$ and
$C_2$ of $\mathbb{C} \setminus \{0\}$, slit from $0$ to $-\infty$ along the real axis. The top left-hand
quadrant of $C_1$ is glued to the bottom left-hand quadrant of $C_2$, and the
top left-hand quadrant of $C_2$ is glued to the bottom left-hand quadrant
of $C_1$. If we try to draw the resulting surface $S$ in $\mathbb{R}^3$ then it is forced
to intersect itself, but abstractly no such self-intersection is implied by the
gluing recipe.

There is a canonical projection $p : S \to \mathbb{C} \setminus \{0\}$ which identifies each
copy $C_j$ of $\mathbb{C} \setminus \{0\}$ with $\mathbb{C} \setminus \{0\}$. Any continuous path $\gamma(t)$ in $\mathbb{C} \setminus \{0\}$ lifts to a
continuous path $\tilde{\gamma}(t)$ in $S$, by which we mean that $p(\tilde{\gamma}(t)) = \gamma(t)$. Suppose,
for example, that $\gamma(t)$ describes the unit circle in $\mathbb{C} \setminus \{0\}$, anticlockwise, so
that $\gamma(t) = e^{it}$ for $0 \le t \le 2\pi$. As $t$ increases from $0$ to $\pi$, this curve lifts
to


$$\tilde{\gamma}(t) = e^{it} \in C_1.$$


However, because the two sheets $C_1, C_2$ are cross-connected along their
negative real axes, the curve lifts to

$$\tilde{\gamma}(t) = e^{it} \in C_2$$

as $t$ increases from $\pi$ to $2\pi$. So the lifted path, unlike the original, is not a
closed loop: instead, it returns to a different sheet, $C_2$ rather than $C_1$.

If the argument $z$ describes the path $\gamma(t)$ in $\mathbb{C} \setminus \{0\}$, and we choose
a continuously varying value for $\sqrt{z}$, then the choices $\pm\sqrt{z}$ change in
the same manner as the sheets of the surface $S$. That is, we can define a
single-valued square root on $S$. This is Riemann’s idea. The surface $S$ is
one of the simplest examples of a Riemann surface in the classical sense: for
further examples and proofs see Stewart and Tall [73] page 268 onwards.

The modern treatment of a Riemann surface is based on the geometry 
of the surface and the ‘complex structure’ given by its relation to C: 


Definition 14.13. A surface $S$ is a topological space, covered by a
countable collection of open subsets $U$ called patches. Each patch $U$ is equipped
with a local coordinate map $\alpha_U : U \to D$ where $D \subset \mathbb{R}^2$ is the open unit
disc. Finally, if $x \in S$ lies in the overlap $U \cap V$ of two patches, then the
overlap map $\alpha_V \alpha_U^{-1}$ must be continuous where it is defined.

A Riemann surface is defined in the same way, but now we consider $D$
to be the open unit disc in $\mathbb{C}$, and the overlap maps are required to be
conformal, that is, to preserve the angles at which curves cross.


For further details see McKean and Moll [52] pages 3-5. 
We can now define an elliptic modular function: 


Definition 14.14. Let $N > 0$ be an integer, and define a subgroup
$\Gamma_0(N) \subset \mathrm{SL}_2(\mathbb{Z})$ by


$$\Gamma_0(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{Z},\ ad - bc = 1,\ N \mid c \right\}.$$


This group acts on $H$ and there is a compact Riemann surface $X_0(N)$ such
that


$$H/\Gamma_0(N) = X_0(N) \setminus K$$


where $K$ is some finite set of points. The points in $K$ are the cusps of the
modular curve $X_0(N)$ of level $N$.


Definition 14.15. An (elliptic) modular function of level $N$ on $H$ is a
function $f(z)$ that is invariant under $\Gamma_0(N)$ and descends to a function
that is meromorphic on $X_0(N)$, even at the cusps.


(Here ‘descends’ just means that $f$ has the same value on each orbit of
$\Gamma_0(N)$ on $H$, and hence defines a function on the quotient space $H/\Gamma_0(N)$.)


14.5 The Frey Elliptic Curve 


A major reason why problems like Fermat’s Last Theorem remain unsolved 
for centuries is that it is difficult to find a reasonable line of attack—a place 
to start from. As we have seen, the ‘big idea’ of the 1840s and 1850s was 
to reformulate the problem in a cyclotomic number field. This idea led to 
significant progress, and although in the end it failed to prove Fermat’s Last 
Theorem, it left a legacy that was far more important than the theorem 
itself: the whole machinery of ideals in algebraic number theory. This 
kind of development is quite common in mathematics: the significance 
of a notorious unsolved problem often lies not in its answer (nothing of 
great importance would follow easily or directly from knowing whether 
Fermat’s Last Theorem is true or false) but in the methods that the search 
for an answer can open up. Such problems serve as glorious reminders 
of areas of massive ignorance, and quell any belief that mathematics is 
‘pretty much worked out’. As it turned out, Fermat’s Last Theorem has 
stimulated the creation of several major mathematical theories, whose
far-reaching consequences are still being discovered. Wiles’s ideas, which
led to a complete proof of the Taniyama–Shimura–Weil Conjecture (important
in its own right because it opens up new lines of attack on all kinds of
questions), are the latest addition to the list.

When the approach to Fermat’s Last Theorem by way of cyclotomic
number fields ground to a halt in the 1980s, no plausible line of attack
seemed to be visible. That situation changed dramatically with the work
of Hellegouarch [37] and Frey [26, 27], who indulged in some major lateral 
thinking that revealed a startling link between Fermat’s Last Theorem and 
the theory of elliptic curves. Bearing in mind that elliptic curves form one 
of the deepest areas of number theory, one equipped with a vast array of 
powerful machinery, the significance of this breakthrough was immediately 
evident to number-theorists. 

It was Frey, above all, who made this link solid and complete. His 
idea is to try to prove Fermat’s Last Theorem by contradiction (of itself an 
obvious way to proceed and nothing new) by assuming that there exist three 
pairwise coprime non-zero integers a, b,c that satisfy the Fermat Equation 


$$a^p + b^p = c^p$$


for an odd prime $p$. On this assumption, define the Frey elliptic curve $F$
by


$$y^2 = x(x - a^p)(x + b^p).$$


Since the above solution gives rise to two further solutions $b^p + a^p = c^p$
and $a^p + (-c)^p = (-b)^p$ we can arrange for $b$ to be even and
$a \equiv -1 \pmod 4$. These conditions are needed to make $F$ ‘semistable’, a notion that
we discuss below. For technical reasons, it is also useful to assume p > 3, 
which involves no loss of generality since both Fermat and Euler proved 
Fermat’s Last Theorem for p = 3. 

Frey’s chief contribution was to recognise that the curve F has such 
strange properties that it cannot possibly exist. If this could be proved, 
then by contradiction there could be no solutions to the Fermat equation, 
and Fermat’s Last Theorem would be proved. Moreover, Frey [27] provided 
strong but incomplete evidence why F cannot exist. Namely, if it does, then 
it contradicts the Taniyama–Shimura–Weil Conjecture. The main gap in
his argument was filled in by Serre, but only by invoking a conjecture of his
own, the Special Level Reduction Conjecture. In 1986 Ribet [59] proved
Serre’s Special Level Reduction Conjecture (the proof was not published
until 1990), at which stage the hoped-for proof of Fermat’s Last Theorem
rested only on the Taniyama–Shimura–Weil Conjecture. It was a sufficiently
powerful special case of this conjecture that Wiles attacked, over a
period of seven years, and (not without hiccups) demolished. 


14.6 The Taniyama–Shimura–Weil Conjecture


The Taniyama–Shimura–Weil Conjecture (often also attributed to some
subset of those three mathematicians) can be stated in numerous forms,
which look very different but are equivalent given the state of knowledge
in the field in the 1980s. In essence, it is this:


Conjecture 14.16. (Taniyama–Shimura–Weil.) Every elliptic curve
over $\mathbb{Q}$ is modular.


However, in order for this to make sense we have to explain what it 
means for an elliptic curve to be modular, and this is where the different 
alternatives arise. 

The best known approach involves ‘reduction modulo $p$’, where $p \in \mathbb{Z}$ is
a prime. Suppose that $E$ is an elliptic curve over $\mathbb{Q}$, and let $\mathbb{F}_p$ be the field
with $p$ elements. We can write $E$ in projective form as a homogeneous cubic
with integer coefficients, and we can then reinterpret those coefficients as
integers modulo $p$. We then get a cubic equation over $\mathbb{F}_p$, also in projective
form; and we define $b_p$ to be the number of distinct solutions over $\mathbb{F}_p$,
including any that lie at infinity. For instance, suppose that $E$ has affine
equation


$$y^2 = x^3 + 22$$

so that in projective form it becomes

$$y^2 z = x^3 + 22 z^3$$

and let $p = 5$. In $\mathbb{F}_5$ we have $22 = 2$, so we are trying to count the
projectively distinct solutions of

$$y^2 z = x^3 + 2 z^3$$

with $x, y, z \in \mathbb{F}_5$. By trial and error we find that there are exactly 6 of
them, namely

    (x, y, z) = (0, 1, 0),
                (1, 0, 3),
                (1, 1, 2),
                (1, 1, 4),
                (1, 4, 2),
                (1, 4, 4).

Only the first has $z = 0$, so it is the only one at infinity; the second has
$y = 0$ but is an affine point. For example, when $(x, y, z) = (1, 4, 4)$ then
$y^2 z - (x^3 + 2z^3) = -65$, which is congruent to 0 modulo 5.
(Remember that if any solution is multiplied throughout by a non-zero
constant, then projectively this is the same solution, so that, for instance,
$(1, 1, 2)$ is the same as $(3, 3, 1)$.) We conclude that for this $E$,


$$b_5 = 6.$$
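The ‘trial and error’ is a finite search, so it can be delegated to a machine. The Python sketch below enumerates all non-zero triples over $\mathbb{F}_p$, keeps those satisfying the projective equation $y^2 z = x^3 + a x z^2 + b z^3$, and scales each so that its first non-zero coordinate is 1, which picks one representative per projective point. The function name and this normalisation are our own choices; the three-argument `pow` with exponent $-1$ requires Python 3.8 or later.

```python
def projective_points(p, a, b):
    """Projective solutions of y^2 z = x^3 + a x z^2 + b z^3 over F_p,
    each scaled so that its first non-zero coordinate equals 1."""
    seen = set()
    for x in range(p):
        for y in range(p):
            for z in range(p):
                if (x, y, z) == (0, 0, 0):
                    continue
                if (y * y * z - (x**3 + a * x * z * z + b * z**3)) % p != 0:
                    continue
                lead = next(c for c in (x, y, z) if c != 0)
                inv = pow(lead, -1, p)          # inverse of lead modulo p
                seen.add(((x * inv) % p, (y * inv) % p, (z * inv) % p))
    return seen

points = projective_points(5, 0, 22)    # the curve y^2 = x^3 + 22 reduced mod 5
print(len(points), sorted(points))
```

With this normalisation the search returns exactly the six representatives listed above.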


The numbers $b_p$, for various $p$, encode useful information about $E$.
When $E$ is modular, there is a formula relating all the numbers $b_p$, for all
primes $p$, to a single function. This function is called an eigenform. It can
be written


$$f(z) = \sum_{n=1}^{\infty} a_n e^{2\pi i n z} \tag{14.37}$$


and has some very specific properties (technically, it is a normalized cusp
form of weight 2 for the $\Gamma_0(N)$ of Definition 14.14, and it is an eigenfunction
for all Hecke operators). We then have:


Definition 14.17. An elliptic curve $E$ over $\mathbb{Q}$ is modular if there exists an
eigenform (14.37) such that for all but finitely many primes $p$,


$$b_p = p + 1 - a_p.$$
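As a small worked instance of this relation, take the curve $y^2 = x^3 + 22$ considered earlier, for which $b_5 = 6$. Assuming that $p = 5$ is not among the finitely many excluded primes (an assumption made here only for illustration), the coefficient $a_5$ of a matching eigenform would have to be

```latex
\[
a_5 = p + 1 - b_p = 5 + 1 - 6 = 0 .
\]
```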


It is now known that this definition leads to an alternative, equivalent
formulation of the Taniyama–Shimura–Weil Conjecture, which states that
$E$ can be parametrized by modular functions of a certain kind, much as
the circle can be parametrized by trigonometric functions and every elliptic
curve can be parametrized by the Weierstrass $\wp$-function and its derivative:


Conjecture 14.18. (Taniyama–Shimura–Weil, alternative formulation.)
Let $y^2 = Ax^3 + Bx^2 + Cx + D$ be an elliptic curve, where $A, B, C, D \in \mathbb{Q}$.
Then there exist modular functions $f(z), g(z)$, both of the same level $N$,
such that

$$g(z)^2 = A f(z)^3 + B f(z)^2 + C f(z) + D$$


See Cox [15] and Mazur [50] for further information on this alternative
formulation.

Wiles did not prove the full Taniyama–Shimura–Weil Conjecture,
although this has now been done as a consequence of Wiles’s ideas. He
realised that a more accessible special case would be sufficient: this is
known as the Semistable Taniyama–Shimura–Weil Conjecture. In order to
explain what ‘semistable’ means, we must discuss some numerical
invariants of elliptic curves. Suppose, then, that we have a rational elliptic curve
$y^2 = Ax^3 + Bx^2 + Cx + D$. Over the complex numbers, the cubic equation
$p(x) = Ax^3 + Bx^2 + Cx + D = 0$ has three roots $x_1, x_2, x_3$. Classically, the
discriminant of $p(x)$ is defined to be $(x_1 - x_2)^2 (x_1 - x_3)^2 (x_2 - x_3)^2$. It is
so named because it vanishes if and only if $p(x)$ has a multiple root, so it
‘discriminates’ between the roots. It is the forerunner of the discriminant
of an algebraic number field, defined in Chapter 2 Section 2. We may now
define four invariants of the Frey curve.


• The discriminant, which equals $a^{2p} b^{2p} c^{2p}$. Already we see that the
Frey curve is special, since this is a perfect 2pth power, a highly
unusual circumstance.


• The minimal discriminant, equal to $2^{-8} a^{2p} b^{2p} c^{2p}$. Since b is even and
$p \ge 5$, this is an integer.


• The conductor, which is the product of all primes that divide a, b,
or c. If an elliptic curve is modular, then it can be parametrised by
modular functions whose level is equal to the conductor of the curve.


• The j-invariant, which equals
$$\frac{2^8 (a^{2p} + b^{2p} + a^p b^p)^3}{a^{2p} b^{2p} c^{2p}}$$
and is a complete invariant in the sense that any two elliptic curves with
the same j-invariant are isomorphic over C.


Now we can define semistability. 


Definition 14.19. An elliptic curve is semistable if whenever a prime $l > 3$
divides the discriminant, only two of the three roots of p(x) are congruent
modulo $l$; and similar but more technical conditions on the primes 2 and 3
hold.


Lemma 14.20. The Frey curve F is semistable.


Proof: The discriminant is $a^{2p} b^{2p} c^{2p}$ and the roots are $0, a^p, -b^p$ where
$a^p, b^p$ are coprime. When $l = 2, 3$ the conditions b even, $a \equiv -1 \pmod 4$
are also required in the proof. □


Wiles’s main result is: 


Theorem 14.21. (Semistable Taniyama-Shimura-Weil Conjecture.)
Every semistable elliptic curve over Q is modular. □

Thus every semistable elliptic curve over Q can be parametrised by
modular functions of some level N (equal to the conductor). This leads to:


Lemma 14.22. If $l$ is an odd prime dividing N, then the j-invariant of F
can be written in the form $l^{-mp} q$ where $m > 0$ and $q$ is a rational number
whose numerator and denominator in lowest terms are not divisible by $l$.


Proof: Since $a^p + b^p = c^p$ we have $a^{2p} + b^{2p} + a^p b^p = (a^p + b^p)^2 - a^p b^p = c^{2p} - a^p b^p$,
so the j-invariant of F is

$$\frac{2^8 (a^{2p} + b^{2p} + a^p b^p)^3}{a^{2p} b^{2p} c^{2p}} = \frac{2^8 (c^{2p} - a^p b^p)^3}{(abc)^{2p}}.$$

The power of $l$ dividing the denominator is a multiple of p. Since a, b, c are
pairwise coprime, the above fraction is in lowest terms. The result follows
since N is the product of the primes dividing abc. □


This lemma fails for $l = 2$ because of the factor $2^8$.


14.7 Sketch Proof of Fermat’s Last Theorem 


We can now sketch Wiles’s proof of Fermat’s Last Theorem. We need 
one further ingredient from complex analysis: that of a modular form of 
weight 2. We begin with an elliptic integral of the first kind 


$$\int \frac{dx}{\sqrt{Ax^3 + Bx^2 + Cx + D}}.$$


Setting $y^2 = Ax^3 + Bx^2 + Cx + D$, defining an elliptic curve, this becomes
$\int \frac{dx}{y}$. If the elliptic curve is modular then there exist modular functions
f(z), g(z) such that x = f(z), y = g(z) parametrises the curve. In this case

$$\frac{dx}{y} = \frac{df}{g} = \frac{f'(z)\,dz}{g(z)} = F(z)\,dz$$

where

$$F(z) = \frac{f'(z)}{g(z)}.$$

It is not hard to see that although F is not a modular function, it comes
close:

$$F\left(\frac{az+b}{cz+d}\right) = (cz+d)^2 F(z).$$

In this case we say that F is a modular form of weight 2 and level N.
Modular forms of this type have some distinctive features: in particular, if
the parametrisation (f(z), g(z)) is chosen carefully, the function
F(z) is analytic and vanishes at the cusps. Moreover, it is possible to work
out F(z) from arithmetic information about the elliptic curve, namely, the
number of solutions to the congruences $y^2 \equiv Ax^3 + Bx^2 + Cx + D \pmod p$
for all primes p. It is this connection, rather than anything to do with
Fermat’s Last Theorem, that makes the Taniyama-Shimura-Weil Conjecture
so important.


And now for the climax of this book: 


Theorem 14.23. (Fermat’s Last Theorem.) If p is an odd prime then the
Fermat equation


$$x^p + y^p = z^p$$

has no solutions in nonzero integers x, y, z.


Proof: The full proof can be found in Wiles [82] and Wiles and Taylor [75] 
and is several hundred pages long. We can, however, sketch how the proof 
follows from the concepts introduced above. For a longer, more technical 
sketch, see Ribet [60]. 


We aim for a proof by contradiction, and suppose that there exists a
solution $a^p + b^p = c^p$ for nonzero integers a, b, c. Let F be the corresponding
Frey elliptic curve. By Theorem 14.21 the curve F is modular, hence has
a cusp form F of weight 2 and level N, where N is the conductor of F.


Lemma 14.22 now allows us to invoke Serre’s Level Reduction Conjecture,
proved by Ribet, and this implies that for any odd prime $l$ dividing
N there exists a cusp form $F'$ of weight 2 and level $N/l$, which inherits
various useful properties of F. (It would be too technical to say which.)
Inductively, we can consider an odd prime $l'$ dividing $N/l$ and repeat the
argument to get a cusp form of level $N/ll'$, and so on. The conductor is
divisible by 2 since b is even, and by definition the conductor is a product
of distinct primes. We may therefore remove all odd prime factors of N
and deduce that there exists a cusp form of level 2.


The dimension of the space of such cusp forms is equal to the genus of
the compact Riemann surface $X_0(N)$. But by direct calculation, the genus
of $X_0(2)$ can be shown to equal 0. (Indeed, the genus of $X_0(N)$ is zero for
$N \le 10$.) That is, there are no cusp forms of weight 2 and level 2. This is
a contradiction, and Fermat’s Last Theorem is therefore true. □


14.8 Recent Developments


Wiles’s proof of Fermat’s Last Theorem is not important because it solved
that problem and thereby closed down that line of research. On the contrary,
it is important because it opened up many new areas for future
work. In this section we indicate some of these developments, including
appropriate background material.

Perhaps the most significant development of all occurred six years after
Wiles’s breakthrough, when Christophe Breuil, Brian Conrad, Fred
Diamond, and Richard Taylor announced a proof of the full
Taniyama-Shimura-Weil Conjecture (Darmon [17]). Recall that Wiles required (and
proved) only the semistable case of this conjecture. Their methods are
firmly in the spirit of Wiles’s pioneering work, and we content ourselves
with two observations. The first is that they prove a more general theorem
than the Taniyama-Shimura-Weil Conjecture by rephrasing it in algebraic
form. This more general conjecture has technical advantages, which make
the proof possible. The second is that their methods make heavy use of
Galois Theory (Stewart [71]), which was introduced by Évariste Galois around
1830 as a way to decide whether a polynomial equation can be solved in
terms of radicals, that is, expressions involving nth roots of algebraic formulas.
One consequence of Galois’s work is a simple and conceptual proof that
the quintic equation (the general polynomial equation of the fifth degree)
cannot be solved by radicals, a theorem proved slightly earlier by Abel,
using different methods.

Fermat’s Last Theorem is just one of many questions in number theory
about integer powers. The Catalan Problem of 1844 asks whether 8 and 9
are the only consecutive perfect powers, that is, the only nontrivial solutions
of the equation


$$x^m - y^n = 1. \tag{14.38}$$

In 1976 Tijdeman [76] proved that there exist only finitely many consecutive
perfect powers, and later work showed that the largest solution to (14.38)
satisfies

$$|x| < \exp \exp \exp \exp 730.$$

More generally, we can consider the Diophantine equation

$$x^a + y^b = z^c. \tag{14.39}$$


The ‘surprising’ solutions occur when a, b, c are ‘large’ in some sense. For
the Pythagorean equation (a = b = c = 2), with its infinite family of
solutions, the exponents a, b, c should clearly be considered small. We
explain below why a sensible interpretation of the ‘size’ of a solution, in this
context, is the number

$$s = \frac{1}{a} + \frac{1}{b} + \frac{1}{c}.$$

The smaller s is, the larger a, b, c must be. The crucial distinction is that
‘large’ solutions have s < 1 but ‘small’ ones have s > 1. The only known
integer solutions of (14.39) for large a, b, c (see Mazur [51]) are:


$$\begin{aligned}
1 + 2^3 &= 3^2, \\
2^5 + 7^2 &= 3^4, \\
7^3 + 13^2 &= 2^9, \\
2^7 + 17^3 &= 71^2, \\
3^5 + 11^4 &= 122^2, \\
17^7 + 76271^3 &= 21063928^2, \\
1414^3 + 2213459^2 &= 65^7, \\
9262^3 + 15312283^2 &= 113^7, \\
43^8 + 96222^3 &= 30042907^2, \\
33^8 + 1549034^2 &= 15613^3.
\end{aligned}$$


By convention 1 is treated as $1^\infty$ and $\frac{1}{\infty} = 0$. So $s = \frac{5}{6}$ for the first
solution above.
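
The ten identities can be checked with exact integer arithmetic. The sketch below is illustrative only: the tuple layout and the helper names `power`, `recip` and `check` are our own, and the values are the standard ones for these identities (in particular 1549034 in the last one). `None` stands for the exponent of 1 in the first identity, which the convention above treats as infinite.

```python
from fractions import Fraction

# Each entry (x, a, y, b, z, c) encodes x^a + y^b = z^c.
# None marks the "infinite" exponent attached to 1 in the first identity.
SOLUTIONS = [
    (1, None, 2, 3, 3, 2),
    (2, 5, 7, 2, 3, 4),
    (7, 3, 13, 2, 2, 9),
    (2, 7, 17, 3, 71, 2),
    (3, 5, 11, 4, 122, 2),
    (17, 7, 76271, 3, 21063928, 2),
    (1414, 3, 2213459, 2, 65, 7),
    (9262, 3, 15312283, 2, 113, 7),
    (43, 8, 96222, 3, 30042907, 2),
    (33, 8, 1549034, 2, 15613, 3),
]


def power(base, exp):
    # The only base with exponent None is 1, and 1 to any power is 1.
    return 1 if exp is None else base ** exp


def recip(exp):
    # Convention from the text: 1/infinity = 0.
    return Fraction(0) if exp is None else Fraction(1, exp)


def check(sol):
    """Return (identity holds?, s = 1/a + 1/b + 1/c) for one entry."""
    x, a, y, b, z, c = sol
    holds = power(x, a) + power(y, b) == power(z, c)
    return holds, recip(a) + recip(b) + recip(c)


for sol in SOLUTIONS:
    print(sol, *check(sol))
```

Every entry satisfies its identity and has s < 1, so all ten are ‘large’ in the sense just described.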

The first five of these solutions have been known for centuries; the last 
five are due to Beukers and Zagier. The main conjecture here is: 


Conjecture 14.24. (Fermat-Catalan Conjecture.) In total, for all large
(a,b,c) (that is, s < 1) there exists only a finite number of coprime integer
solutions of Equation (14.39).


The name of the conjecture is modern, due to Darmon and Granville: 
it reflects the fact that a positive solution would imply both the Catalan 
Conjecture and Fermat’s Last Theorem. 

The main positive result is a recent theorem of Darmon and Merel, who 
prove that there are no solutions with (a,b,c) = (g,g,3) for g > 3. Darmon 
and Granville have proved that for each individual triple (a, b,c) with s < 1 
there exist only finitely many coprime integer solutions x, y,z of (14.39). 
The Fermat—Catalan Conjecture says more than this: the number of triples 
(a,b,c) with s < 1 for which coprime integer solutions exist is also finite. 

In order to formulate conjectures and theorems in this area with greater
precision, we require the following simple concepts:

Definition 14.25. Let N be an integer. Then the radical rad N is the
product of all distinct prime factors of N.
If $N \ne 0, \pm 1$ then the power function of N is


$$P(N) = \frac{\log |N|}{\log \operatorname{rad} N}.$$

By convention,

$$P(\pm 1) = \infty.$$


Obviously P(N) = 1 if and only if N is squarefree. If N is a perfect
kth power, then $P(N) \ge k$. We also define an a-powered number to be a
number N for which $P(N) \ge a$. Roughly speaking, the larger a becomes,
the rarer a-powered numbers are. For example:


Proposition 14.26. As $x \to \infty$ the number of squarefree integers between
0 and x is of the form


$$\frac{6}{\pi^2}\, x + O(\sqrt{x}).$$


Proof: See Exercise 8. □


Informally, this proposition tells us that roughly 60% of integers (up to 
a given size) are squarefree. 
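
The radical, the power function and the squarefree count are all easy to experiment with. The following is a small sketch using trial division; the function names `rad`, `power_function` and `squarefree_count` are ours, chosen for illustration, and the code is meant for small inputs only.

```python
import math


def rad(n):
    """Product of the distinct primes dividing n (so rad(1) = rad(-1) = 1)."""
    n = abs(n)
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            result *= p
            while n % p == 0:
                n //= p
        p += 1
    # Whatever is left (if > 1) is a single prime factor.
    return result * n if n > 1 else result


def power_function(n):
    """P(N) = log|N| / log rad N, with P(+-1) = infinity by convention."""
    if n == 0:
        raise ValueError("P(0) is not defined")
    if abs(n) == 1:
        return math.inf
    return math.log(abs(n)) / math.log(rad(n))


def squarefree_count(x):
    """Number of squarefree integers n with 1 <= n <= x."""
    return sum(1 for n in range(1, x + 1) if rad(n) == n)


print(power_function(30), power_function(72), power_function(2**10))
x = 10_000
print(squarefree_count(x), 6 * x / math.pi**2)
```

The squarefree test `rad(n) == n` is just the statement P(N) = 1 rewritten without logarithms, and the last line compares the exact count with the leading term $6x/\pi^2$.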

Fermat’s Last Theorem asserts that a particular ‘linear relation’ between
perfect nth powers is rare (indeed, so rare that it never happens).
That is, the sum of two nth powers is never an nth power. More generally,
we can look at linear relations between three a-powered numbers and ask
how rare those are. To be precise, choose three real numbers $a, b, c \ge 1$
and a real number x. Let S(a, b, c; x) be the set of all triples (A, B, C) of
integers (assumed relatively prime and nonzero) such that


$$|A| < x, \quad |B| < x, \quad |C| < x,$$
$$A + B + C = 0,$$
$$P(A) \ge a, \quad P(B) \ge b, \quad P(C) \ge c.$$


Given a, b, c, how rapidly do we expect the cardinality of S(a, b, c; x) to
grow as $x \to \infty$? That is, how rare (or how common) are solutions to
A + B + C = 0 in a-, b-, c-powered numbers A, B, C?

A heuristic argument leads to a striking guess. Ignore the condition
that A, B, C be relatively prime, because this does not change the likely
result much. Then there are roughly $x^{1/a}$ choices for A, $x^{1/b}$ choices for B,
and $x^{1/c}$ choices for C. Since $|A + B + C| < 3x$, the probability (whatever
that means here) that A + B + C = 0 is of the order $1/x$, so
the cardinality of S(a, b, c; x) should be comparable to $x^{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} - 1}$. This
argument is very rough-and-ready, but it focuses attention on the basic
exponent


$$d = \frac{1}{a} + \frac{1}{b} + \frac{1}{c} - 1 = s - 1$$


and leads us to consider three different cases:

$$d < 0, \qquad d = 0, \qquad d > 0$$


where we expect very different results. Roughly speaking, if d > 0 then
we expect a proportion $x^d$ of solutions with $|A|, |B|, |C| < x$, so that in
particular if we allow x to take on all values, then we expect there to exist
infinitely many solutions to A + B + C = 0. When d < 0, however, we
expect there to be only finitely many solutions (allowing x to range over
all values). When d = 0 we have a delicate transitional case and caution is
needed even in formulating a sensible conjecture.

Suppose that $a \le b \le c \in \mathbb{N}$. Then we can tabulate all cases for which
d > 0 (Table 14.1). This table is familiar from other areas of mathematics,
notably polyhedra and tilings in higher dimensions, which raises the hope
that the approach being adopted here is significant.

[Table 14.1. Integer a, b, c for which d > 0.]

When d > 0 and $a, b, c \in \mathbb{N}$ we can get large numbers of (a, b, c′)
solutions with c′ close to c from single Diophantine equations, such as
$x^a + y^b = E z^c$ for some fixed integer $E \ne 0$. Thus, for instance, to obtain
lots of (2, 2, c) solutions we might consider $x^2 + y^2 = z^c$. When c = 2
the problem is that of Pythagorean triples, and these occur in sufficient
profusion to establish the conjectured asymptotics, in the sense that the
number of solutions with $|A|, |B|, |C| < x$ is at least $x^{d - \epsilon}$ for any $\epsilon > 0$,
however small.
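
The integer triples with d > 0 can be enumerated directly. The sketch below is our own illustration, not a reproduction of Table 14.1: it searches $2 \le a \le b \le c$ up to an arbitrary bound, using exact fractions. Triples with a = 1 are left out because they always give d > 0.

```python
from fractions import Fraction


def d(a, b, c):
    """The basic exponent d = 1/a + 1/b + 1/c - 1, as an exact fraction."""
    return Fraction(1, a) + Fraction(1, b) + Fraction(1, c) - 1


def triples_with(predicate, bound):
    """All (a, b, c) with 2 <= a <= b <= c <= bound whose d satisfies predicate."""
    return {
        (a, b, c): d(a, b, c)
        for a in range(2, bound + 1)
        for b in range(a, bound + 1)
        for c in range(b, bound + 1)
        if predicate(d(a, b, c))
    }


positive = triples_with(lambda v: v > 0, 12)
# Apart from the infinite family (2, 2, c), only three triples have d > 0.
sporadic = {t: v for t, v in positive.items() if t[:2] != (2, 2)}
borderline = triples_with(lambda v: v == 0, 12)
print(sporadic)
print(sorted(borderline))
```

Within the search range the positive cases are the family (2, 2, c) with d = 1/c, together with (2, 3, 3), (2, 3, 4) and (2, 3, 5); the borderline cases d = 0 are (2, 3, 6), (2, 4, 4) and (3, 3, 3).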

When d < 0 the problem becomes, if anything, more interesting. We 
may state the following: 


Conjecture 14.27. ((a,b,c) Conjecture.) If $\frac{1}{a} + \frac{1}{b} + \frac{1}{c} < 1$ then the number
of solutions A, B, C of the equation A + B + C = 0 with $P(A) \ge a$, $P(B) \ge b$,
$P(C) \ge c$ is finite.


Indeed, there is a stronger conjecture:


Conjecture 14.28. (Uniform (a,b,c) Conjecture.) Let $d_0 < 0$ be real.
If $\frac{1}{a} + \frac{1}{b} + \frac{1}{c} < 1$ then the number of solutions A, B, C of the equation
A + B + C = 0 with $P(A) \ge a$, $P(B) \ge b$, $P(C) \ge c$ and $\frac{1}{a} + \frac{1}{b} + \frac{1}{c} - 1 < d_0$
is finite.


The Uniform (a, b,c) Conjecture implies the (a, b,c) Conjecture. At the 
time of writing, neither conjecture has been proved in any case whatsoever! 
However, both would be consequences of the so-called ABC-Conjecture of 
Masser and Oesterlé [46, 56], one of the biggest open questions in current 
number theory. We state this conjecture after setting up one necessary 
concept. 

Define an ABC solution to be a triple (A, B, C) of nonzero coprime
integers such that A + B + C = 0. Define the power of (A, B, C) to be


$$P(A, B, C) = \frac{\log \max(|A|, |B|, |C|)}{\log \operatorname{rad}(ABC)}.$$


Then the conjecture is: 


Conjecture 14.29. (Masser and Oesterlé ABC-Conjecture.) For any real 
p> 1 there exist only finitely many ABC solutions with P(A, B,C) > p. 
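
To get a feel for the sizes involved, here is a hedged sketch that evaluates the power of a few ABC solutions. It assumes the usual definition $P(A,B,C) = \log\max(|A|,|B|,|C|)/\log\operatorname{rad}(ABC)$; the function names `rad` and `abc_power` are ours, and the three sample triples are chosen only for illustration.

```python
import math


def rad(n):
    """Product of the distinct primes dividing n, by trial division."""
    n = abs(n)
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            result *= p
            while n % p == 0:
                n //= p
        p += 1
    return result * n if n > 1 else result


def abc_power(A, B, C):
    """Power of an ABC solution: log max(|A|,|B|,|C|) / log rad(ABC).

    Assumes the usual definition; raises if (A, B, C) is not an ABC solution.
    """
    if 0 in (A, B, C) or A + B + C != 0:
        raise ValueError("need nonzero integers with A + B + C = 0")
    if math.gcd(math.gcd(A, B), C) != 1:
        raise ValueError("A, B, C must be coprime")
    return math.log(max(abs(A), abs(B), abs(C))) / math.log(rad(A * B * C))


for triple in [(1, 8, -9), (3, 125, -128), (2, 3**10 * 109, -(23**5))]:
    print(triple, abc_power(*triple))
```

The last triple rests on the identity $2 + 3^{10} \cdot 109 = 23^5$ and has power a little above 1.6, which shows how far above 1 the power of an individual ABC solution can lie.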


Finally, we introduce the Beal Conjecture: see also Mauldin [47]. Andrew
Beal is a number theory enthusiast living in Dallas, Texas. When
Fermat’s Last Theorem was proved, he decided to follow the example of 
Paul Wolfskehl. He offered a prize of $5,000 (increasing annually by $5,000 
up to a total of $50,000) for a proof of: 


Conjecture 14.30. (Beal Conjecture.) Let x, y, z, a, b, c be positive integers
with a, b, c > 2. If $x^a + y^b = z^c$, then x, y, z have a common factor > 1.

This conjecture is currently open. All ten ‘large’ solutions of $x^a + y^b = z^c$
have one exponent equal to 2, so the Beal Conjecture is consistent with the
known data related to the Fermat-Catalan Conjecture. It is clear that
plenty of unsolved questions about sums of powers remain to keep the next
generation of number theorists busy.


14.9 Exercises 


1. Prove that


$$s(-u) = -s(u), \qquad c(-u) = c(u)$$


for all u for which s(u), c(u) are defined.
[Hint: s(−u) and c(−u) satisfy the differential Equation (14.9), hence
are of the form As(u) + Bc(u) for constants A, B. Which?]

2. Prove that the map $\Omega : \mathbb{R} \to S^1$ defined by $\Omega(u) = (c(u), s(u))$ is
onto.
[Hint: show that the image of c is the interval [−1, 1] and that both
signs for $\sqrt{1 - c^2(u)}$ can be realised by s(u), s(−u).]


3. Prove that the series

$$\wp'(z) = -2 \sum_{l \in L} \frac{1}{(z - l)^3}$$

and

$$\wp(z) = \frac{1}{z^2} + \sum_{l \in L \setminus 0} \left( \frac{1}{(z - l)^2} - \frac{1}{l^2} \right)$$

are absolutely convergent provided $z \notin L$.


[Hint: Break the sum up in terms of lattice points that lie on the
parallelograms $P_j$ with vertices $\pm 2j\omega_1 \pm 2j\omega_2$. Estimate the sums
over the points in these parallelograms.] See Hancock [33] page 311.


4. Let

$$f(z) = \frac{az + b}{cz + d}, \qquad g(z) = \frac{Az + B}{Cz + D}$$

be Möbius maps, so that $ad - bc \ne 0$, $AD - BC \ne 0$. Show that


$$g(f(z)) = \frac{(Aa + Bc)z + (Ab + Bd)}{(Ca + Dc)z + (Cb + Dd)}.$$


Deduce that the set of Möbius maps forms a group under composition.


5. Continuing Exercise 4, if ad − bc = 1 and AD − BC = 1, prove that
(Aa + Bc)(Cb + Dd) − (Ab + Bd)(Ca + Dc) = 1. Deduce that the
modular group really is a group.


6. Prove that the modular group is isomorphic to


$$\mathrm{PSL}_2(\mathbb{Z}) = \mathrm{SL}_2(\mathbb{Z})/Z.$$


7. Let S, T be as defined in (14.36). Prove that $S^2 = -I$, T has order
$\infty$, and ST has order 3.


8. Give a heuristic argument to show that the number of squarefree
integers less than x is equal to $\frac{6}{\pi^2} x + O(\sqrt{x})$ for $x \to \infty$.


[Hint: Let $p_j$ be the primes in increasing order and consider the
sequence of integers 1, 2, 3, ..., x. Remove from this sequence all multiples
of $p_1^2$, leaving approximately $x\left(1 - \frac{1}{p_1^2}\right)$ integers. From these,
remove all multiples of $p_2^2$, then $p_3^2$, and so on. Continue until $p_j \sim \sqrt{x}$,
so that the number of integers left (which are the squarefree ones) is
approximately


$$x \prod_{p_j \le \sqrt{x}} \left( 1 - \frac{1}{p_j^2} \right).$$


Now use Euler’s result that

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \prod_{p} \left( 1 - \frac{1}{p^2} \right)^{-1} = \frac{\pi^2}{6}$$

and estimate errors.]


A


Quadratic Residues 


The theory of quadratic residues is one of the great triumphs of the classical
period of number theory. An integer k which is prime to a positive integer
m is said to be a quadratic residue modulo m if there exists $z \in \mathbb{Z}$ such that


$$z^2 \equiv k \pmod{m}.$$


Denoting the residue class of k modulo m by $\bar{k}$, this can be rephrased as:
$\bar{k}$ is both a unit and a square in $\mathbb{Z}_m$.

We shall investigate the question of quadratic residues by determining
the structure of the units in $\mathbb{Z}_m$. We shall also show how knowledge of
quadratic residues solves the more general question of finding solutions of
the quadratic equation


$$ax^2 + bx + c = 0$$

in $\mathbb{Z}_m$.


The most remarkable theorem about quadratic residues is known as the
quadratic reciprocity law which states:

If p, q are distinct odd primes, at least one of which is congruent to 1
modulo 4, then p is a quadratic residue of q if and only if q is a quadratic
residue of p; otherwise precisely one of p, q is a quadratic residue of the
other.

The reciprocal nature of the relationship between p and q in the first
case gives rise to the name of the law. Gauss first proved it in 1796 when he
was but eighteen years old. The result had been conjectured earlier by Euler
and Legendre, though Gauss said that he did not know this at the time.
He thought so highly of it that he called it ‘the gem of higher arithmetic’
and developed six different proofs. In the 19th century it continued to
arouse interest, with more than fifty different methods of proof from such
as Cauchy, Eisenstein, Jacobi, Kronecker, Kummer, Liouville and Zeller. In
fact it was when Kummer was studying this and certain higher reciprocity
laws that he came upon his partial proof of Fermat’s Last Theorem. In
1850 Kummer referred to the higher reciprocity laws as ‘the pinnacle
of contemporary number theory’, regarding Fermat’s Last Theorem as a
‘curiosity’ (see Edwards [21]). It is only right and proper, therefore, that
any text on Fermat’s Last Theorem should include a description of the
result that so fascinated number theorists and whose study led to Kummer’s
proof of a special case of Fermat’s Theorem as a by-product.


A.1 Quadratic Equations in $\mathbb{Z}_m$


An obvious topic of number-theoretic study is the solution of polynomial
equations modulo a positive integer m.
A linear equation


$$ax + b \equiv 0 \pmod{m} \tag{A.1}$$

can clearly be solved when a is prime to m, because there exist integers
c, d such that

$$ac + dm = 1.$$

Whence multiplying (A.1) by c and simplifying gives

$$x \equiv -bc \pmod{m}.$$


If, on the other hand, a, m have a highest common factor h, then (A.1) can
only have a solution if h divides b, in which case, writing $a = a_0 h$, $b = b_0 h$,
$m = m_0 h$, we can reduce (A.1) to


$$a_0 x + b_0 \equiv 0 \pmod{m_0}$$


where once more $a_0$, $m_0$ are coprime. Thus the solution of linear equations
modulo m is straightforward.
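
The procedure just described translates directly into code. This is a minimal sketch, not a library routine: the function name `solve_linear` is ours, and it relies on Python's three-argument `pow` with exponent −1 (Python 3.8 or later) to produce the integer c with $a_0 c \equiv 1 \pmod{m_0}$.

```python
from math import gcd


def solve_linear(a, b, m):
    """All x in range(m) with a*x + b ≡ 0 (mod m), following (A.1)."""
    h = gcd(a, m)
    if b % h != 0:
        return []  # a solution needs h | b
    a0, b0, m0 = a // h, b // h, m // h
    # c is the inverse of a0 modulo m0 (trivially 0 when m0 == 1).
    c = pow(a0, -1, m0) if m0 > 1 else 0
    x0 = (-b0 * c) % m0
    # The single solution modulo m0 lifts to h solutions modulo m.
    return [x0 + t * m0 for t in range(h)]


print(solve_linear(3, 4, 7))   # a prime to m: exactly one solution
print(solve_linear(4, 2, 6))   # h = 2 divides b: two solutions modulo 6
print(solve_linear(4, 1, 6))   # h = 2 does not divide b: no solutions
```

When h > 1 the reduced congruence has one solution modulo $m_0$, which accounts for h distinct solutions modulo m.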
Quadratic equations are more interesting. The equation


$$ax^2 + bx + c \equiv 0 \pmod{m} \tag{A.2}$$


where m does not divide a may be simplified by multiplying throughout
(including the modulus) by 4a and completing the square to get the equivalent
equation


$$4a^2 x^2 + 4abx + 4ac \equiv 0 \pmod{4am}$$

or

$$(2ax + b)^2 \equiv b^2 - 4ac \pmod{4am}.$$


Now substituting $4am = m_0$, $2ax + b \equiv z \pmod{m_0}$ and $b^2 - 4ac \equiv k \pmod{m_0}$,
we replace (A.2) by the two equations:


$$z^2 \equiv k \pmod{m_0} \tag{A.3}$$
$$2ax + b \equiv z \pmod{m_0} \tag{A.4}$$


If we find z from (A.3) we can then attack (A.4) by the given method for
linear congruences, so the solution of the general quadratic (A.2) reduces
to solving (A.3). We can further reduce this if k, $m_0$ are not coprime by
supposing they have highest common factor h where $k = k_0 h$ and
$m_0 = m_1 h$, and then factorizing h as

$$h = e^2 r$$

where $e^2$ is the largest square factor of h. Then for (A.3) to have a solution
for z we must have er as a factor of z. Let z = erw; then


$$e^2 r^2 w^2 \equiv k_0 e^2 r \pmod{m_1 e^2 r}$$


so

$$r w^2 \equiv k_0 \pmod{m_1}. \tag{A.5}$$


Now suppose that the highest common factor of r, $m_1$ is s. For (A.5) to
have a solution we must have s as a factor of $k_0$. But $k_0$, $m_1$ are coprime,
so s = 1 and r, $m_1$ are also coprime. Thus there exists an integer d prime
to $m_1$ such that

$$dr \equiv 1 \pmod{m_1},$$


so multiplying (A.5) by d and simplifying gives

$$w^2 \equiv d k_0 \pmod{m_1}. \tag{A.6}$$


But d and $k_0$ are both prime to $m_1$, so putting $d k_0 = k_1$, the solution of
the general quadratic equation (A.2) reduces to linear congruences and


$$w^2 \equiv k_1 \pmod{m_1}$$


where $k_1$, $m_1$ are coprime.
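
The first step of this reduction, completing the square, can be tested numerically. The sketch below is illustrative only: it assumes a is a positive integer, uses brute-force search for the square roots in (A.3), and the function names `roots_direct` and `roots_via_square` are ours.

```python
def roots_direct(a, b, c, m):
    """Solutions x in range(m) of a x^2 + b x + c ≡ 0 (mod m), by search."""
    return [x for x in range(m) if (a * x * x + b * x + c) % m == 0]


def roots_via_square(a, b, c, m):
    """The same solutions, obtained through (A.3) and (A.4); assumes a > 0."""
    m0 = 4 * a * m
    k = (b * b - 4 * a * c) % m0
    # (A.3): all z modulo m0 with z^2 ≡ k.
    square_roots = {z for z in range(m0) if (z * z - k) % m0 == 0}
    # (A.4): keep the x for which 2 a x + b is one of those z modulo m0.
    return [x for x in range(m) if (2 * a * x + b) % m0 in square_roots]


print(roots_direct(1, 1, 1, 7), roots_via_square(1, 1, 1, 7))
print(roots_direct(2, 3, 1, 15), roots_via_square(2, 3, 1, 15))
```

The two functions agree because multiplying the congruence and its modulus by 4a is reversible, which is exactly the equivalence claimed for (A.2).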

If m > 1 and k are integers, recall that k is a quadratic residue modulo
m if

(i) k, m are coprime,

(ii) there exists $w \in \mathbb{Z}$ such that


$$w^2 \equiv k \pmod{m}.$$

If $\bar{k}$ is the residue class of k in $\mathbb{Z}_m$, then these conditions are equivalent to
(i)′ $\bar{k}$ is a unit in $\mathbb{Z}_m$,
(ii)′ $\bar{w}^2 = \bar{k}$ in $\mathbb{Z}_m$,
so that $\bar{k}$ is both a unit and a square in $\mathbb{Z}_m$. We shall attack the problem
of finding quadratic residues by first computing the structure of the units
in $\mathbb{Z}_m$.


A.2 The Units of $\mathbb{Z}_m$


An element $\bar{k} \in \mathbb{Z}_m$ is a unit if and only if k, m are coprime, so the units
$U(\mathbb{Z}_m)$ of $\mathbb{Z}_m$ are given by


$$U(\mathbb{Z}_m) = \{ \bar{k} \in \mathbb{Z}_m \mid 1 \le k < m;\ k, m \text{ coprime} \}.$$


The number of elements in $U(\mathbb{Z}_m)$ is called the Euler function $\phi(m)$ and
is equal to the number of positive integers k less than m and prime to it.
For example $\phi(10) = 4$ since 1, 3, 7, 9 are the integers between 1 and 10
and prime to 10, and there are four of them. For later reference we record:


Lemma A.1. If p is prime, then $\phi(p^e) = p^{e-1}(p - 1)$, and in particular
$\phi(p) = p - 1$.
Proof: There are $p^e - 1$ elements satisfying $1 \le k < p^e$, and of these, if
k = rp, then

$$1 \le rp < p^e$$


implies

$$1 \le r < p^{e-1},$$


so there are $p^{e-1} - 1$ elements not prime to p, giving


$$\phi(p^e) = (p^e - 1) - (p^{e-1} - 1). \qquad \Box$$
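
Lemma A.1 is easy to confirm numerically. The following sketch counts units directly from the definition; the function name `phi` is ours, and the count runs over $1 \le k \le m$ so that it also makes sense for m = 1.

```python
from math import gcd


def phi(m):
    """Euler function: how many k with 1 <= k <= m are prime to m."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


print(phi(10))
for p in (2, 3, 5, 7):
    for e in (1, 2, 3, 4):
        # Compare the direct count with the closed form of Lemma A.1.
        print(p, e, phi(p**e), p**(e - 1) * (p - 1))
```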


The units $U(\mathbb{Z}_m)$ form a group under multiplication whose structure we
shall compute. First we factorize $m = p_1^{e_1} \cdots p_r^{e_r}$ where $p_1, \ldots, p_r$ are
distinct primes and reduce the problem to considering each prime separately.
For typographical reasons we write $p_i^{e_i} = P_i$.


Lemma A.2. If $m = p_1^{e_1} \cdots p_r^{e_r}$ where $p_1, \ldots, p_r$ are distinct primes, then,
writing $p_i^{e_i} = P_i$, there is a ring isomorphism


$$\mathbb{Z}_m \cong \mathbb{Z}_{P_1} \times \cdots \times \mathbb{Z}_{P_r}.$$

Proof: Define $\pi : \mathbb{Z} \to \mathbb{Z}_{P_1} \times \cdots \times \mathbb{Z}_{P_r}$ by $\pi(k) = (k_1, \ldots, k_r)$ where $k_i$ is
the residue class of k modulo $P_i = p_i^{e_i}$. Clearly $\pi$ is a ring homomorphism
and $k \in \ker \pi$ if and only if $P_i$ divides k for every i, which implies
$\ker \pi = (m)$ (the ideal generated by m). Thus $\pi$ induces a monomorphism


$$\bar{\pi} : \mathbb{Z}_m \to \mathbb{Z}_{P_1} \times \cdots \times \mathbb{Z}_{P_r}.$$


But $\mathbb{Z}_{P_1} \times \cdots \times \mathbb{Z}_{P_r}$ has $P_1 \times \cdots \times P_r = m$ elements, so $\bar{\pi}$ is in fact an
isomorphism. □


Lemma A.3. With the above notation,

$$U(\mathbb{Z}_m) \cong U(\mathbb{Z}_{P_1}) \times \cdots \times U(\mathbb{Z}_{P_r}).$$


Proof: Under the ring isomorphism $\bar{\pi}$, an element $\bar{k} \in \mathbb{Z}_m$ is a unit if and
only if $\bar{\pi}(\bar{k})$ is a unit, which holds if and only if each of its components is
a unit. □


Lemma A.3 reduces the study of units in $\mathbb{Z}_m$ to the case of $\mathbb{Z}_{p^e}$ where
p is prime. To tackle this case we begin with e = 1. Here we shall see that
$U(\mathbb{Z}_p)$ is a cyclic group of order p − 1. It might seem that the easiest way to
show this would be to find a generator, but to give an explicit construction
for one proves to be moderately intricate. So we attack the problem
indirectly by introducing an auxiliary notion which will prove useful in several
ways:

The exponent h of a finite group G is defined to be the smallest positive
integer such that (in multiplicative notation) $x^h = 1$ for all $x \in G$. By
Lagrange’s Theorem, $x^n = 1$ where n is the order of G, so clearly $h \le n$.
Also, for any $x \in G$, if x has order k, then k divides h. An alternative
definition of the exponent is, therefore, the least common multiple of all
the orders of the elements in G.

We intend to establish that in an abelian group (such as $U(\mathbb{Z}_p)$) there
actually exists an element $x_0$ of order h. This will then establish two facts:
first that (by Lagrange’s Theorem) the exponent h divides the order of G;
and second, if we can demonstrate that the exponent equals the order, then
G is cyclic with generator $x_0$.

To demonstrate the existence of such an $x_0$, we begin with:


Lemma A.4. In an abelian group G, if a, b have finite orders q, r which
are coprime, then ab has order qr.


Proof: $(ab)^{qr} = a^{qr} b^{qr} = 1$. Now suppose $(ab)^s = 1$, then


$$a^s = b^{-s}$$

so the elements $a^s$ and $b^{-s}$ must have the same order k; however the order
of $a^s$ divides the order of a, so k divides q, similarly k divides r. Since q, r
are coprime, k = 1, which implies


$$a^s = b^{-s} = 1,$$


whence q divides s, r divides s, and since q, r are coprime, qr divides s,
completing the proof. □


Lemma A.5. If the finite abelian group G has exponent h, then there exists
$x_0 \in G$ such that the order of $x_0$ is h.


Proof: Let $p^\alpha$ be the highest power of a prime p dividing h. It is easy
to see that G must have an element x of order $p^\alpha r$ where r is prime to p
(for if not, the highest power of p dividing the order of every element is
$p^{\alpha - 1}$ or less, contrary to the definition of h). The element $y = x^r$ is then
of order $p^\alpha$. Find elements y for all distinct primes p dividing h and then
use Lemma A.4. □


Proposition A.6. $U(\mathbb{Z}_p)$ is a cyclic group of order p − 1.


Proof: The order of $U(\mathbb{Z}_p)$ is $\phi(p) = p - 1$ by Lemma A.1. To show $U(\mathbb{Z}_p)$
is cyclic by Lemma A.5, we need only verify that the exponent h of $U(\mathbb{Z}_p)$
is p − 1. Certainly $h \le p - 1$. Conversely, every element of $U(\mathbb{Z}_p)$ satisfies
$x^h = 1$ by definition, and interpreting $x^h - 1 = 0$ as a polynomial equation
over $\mathbb{Z}_p$, this has, at most, h roots, hence $p - 1 \le h$. □


Examples.


U($\mathbb{Z}_2$) = {1}, generator 1,
U($\mathbb{Z}_3$) = {1, 2}, generator 2,
U($\mathbb{Z}_5$) = {1, 2, 3, 4}, generators 2, 3.


If $\bar{s}$ is a generator of $U(\mathbb{Z}_p)$, then $s \in \mathbb{Z}$ is called a primitive root modulo
p. Such primitive roots will play a central part in our computations, since
every element of $U(\mathbb{Z}_p)$ is of the form $\bar{s}^n$ where s is a primitive root modulo
p. If we can find a primitive root, then because the order of $U(\mathbb{Z}_p)$ is even,
the even powers $\bar{s}^{2n}$ are clearly quadratic residues and the odd powers $\bar{s}^{2n+1}$
are not. In general we shall not attack the problem this way (because we
do not know the value of s), but it illustrates the theoretical importance of
primitive roots. We isolate two properties which will prove essential later:

Lemma A.7.

(i) If s is a primitive root modulo p, then so is $s^r$ if and only if r, p − 1
are coprime.

(ii) If s is a primitive root modulo p and k is a positive integer, then there
is another primitive root $\lambda$ modulo p such that $\bar{s} = \bar{\lambda}^{p^k}$.


Proof: (i) If r, p − 1 are coprime, then there exist integers a, b such that
$ar + b(p - 1) = 1$, so $\bar{s} = \bar{s}^{ar + b(p-1)} = (\bar{s}^r)^a$,
and $\bar{s}^r$ generates $U(\mathbb{Z}_p)$.

Conversely, if r, p − 1 were to have common factor d > 1, where
p − 1 = dq, r = dc, then $(\bar{s}^r)^q = \bar{s}^{dcq} = \bar{s}^{(p-1)c} = 1$,
so $\bar{s}^r$ cannot generate $U(\mathbb{Z}_p)$.

(ii) Since $p^k$, p − 1 are coprime, there exist integers a, b where a is prime
to p − 1 such that

$$a p^k + b(p - 1) = 1.$$

Hence $\lambda = s^a$ is a primitive root modulo p by part (i) and


$$\bar{\lambda}^{p^k} = \bar{s}^{a p^k} = \bar{s}^{a p^k + b(p-1)} = \bar{s}. \qquad \Box$$
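
Primitive roots for small primes can be found by computing orders directly, which also gives a quick check of Lemma A.7 (i). This is a brute-force sketch with our own function names (`order`, `primitive_roots`, `lemma_a7_i_holds`); it is not an efficient algorithm.

```python
from math import gcd


def order(x, m):
    """Multiplicative order of x modulo m; x must be prime to m."""
    k, y = 1, x % m
    while y != 1:
        y = (y * x) % m
        k += 1
    return k


def primitive_roots(p):
    """All primitive roots s modulo the prime p with 1 <= s < p."""
    return [s for s in range(1, p) if order(s, p) == p - 1]


def lemma_a7_i_holds(p):
    """Check: s^r is a primitive root exactly when gcd(r, p - 1) = 1."""
    for s in primitive_roots(p):
        for r in range(1, p):
            is_primitive = order(pow(s, r, p), p) == p - 1
            if is_primitive != (gcd(r, p - 1) == 1):
                return False
    return True


print(primitive_roots(5), primitive_roots(13))
print(all(lemma_a7_i_holds(p) for p in (5, 7, 11, 13)))
```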


We are now in a position to describe the structure of $U(\mathbb{Z}_{p^e})$ for prime
p, which we shall do by explicit computation, first for an odd prime.


Proposition A.8. If p is a prime, $p \ne 2$, $e \ge 2$, then $U(\mathbb{Z}_{p^e})$ is a cyclic
group of order $p^{e-1}(p - 1)$ with generator $\bar{s}(1 + p)$, where s is a primitive
root modulo p.


Proof: Since the order of $U(\mathbb{Z}_{p^e})$ is $p^{e-1}(p - 1)$ by Lemma A.1, and p − 1,
$p^{e-1}$ are coprime, then using Lemma A.4 it is sufficient to show $\bar{s}$ has order
p − 1 and 1 + p has order $p^{e-1}$ in $U(\mathbb{Z}_{p^e})$. In the case of $\bar{s}$ we have


$$\bar{s}^{\,p-1} = \bar{\lambda}^{\,p^{e-1}(p-1)} \quad \text{using Lemma A.7 (ii)}$$
$$\equiv 1 \pmod{p^e} \quad \text{by Lagrange’s theorem.}$$


On the other hand

$$s^r \equiv 1 \pmod{p^e}$$

implies


$$s^r \equiv 1 \pmod{p}$$

and since s is a primitive root modulo p we have

$$s^r \not\equiv 1 \pmod{p^e} \quad \text{for } 1 \le r < p - 1,$$


demonstrating that $\bar{s}$ is of order p − 1 in $U(\mathbb{Z}_{p^e})$. To prove that 1 + p is of
order $p^{e-1}$, we establish by induction on $e \ge 2$ that


$$(1 + p)^{p^{e-2}} \equiv 1 + k p^{e-1} \pmod{p^e} \tag{A.7}$$


where k depends on e, but $k \not\equiv 0 \pmod p$. For e = 2 this is true with
k = 1. Assume it true for some $e \ge 2$. Then


$$(1 + p)^{p^{e-2}} = 1 + k p^{e-1} + r p^e = 1 + s p^{e-1}$$

where $s = k + rp \not\equiv 0 \pmod p$.
Hence

$$(1 + p)^{p^{e-1}} = (1 + s p^{e-1})^p = 1 + p s p^{e-1} + \binom{p}{2} s^2 p^{2(e-1)} + \cdots + s^p p^{p(e-1)}.$$

For $e \ge 2$ and prime $p \ne 2$ this is of the form

$$1 + s p^e + b p^{e+1}.$$

(For p = 2 this only breaks down when e = 2.) Hence

$$(1 + p)^{p^{e-1}} \equiv 1 + s p^e \pmod{p^{e+1}} \tag{A.8}$$


where $s \not\equiv 0 \pmod p$, completing the induction proof of (A.7).
Then (A.7) implies


$$(1 + p)^{p^{e-2}} \ne 1 \text{ in } \mathbb{Z}_{p^e}$$

and (A.8) implies

$$(1 + p)^{p^{e-1}} = 1 \text{ in } \mathbb{Z}_{p^e},$$

which together show 1 + p is of order $p^{e-1}$ in $U(\mathbb{Z}_{p^e})$. □
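
The two orders used in this proof can be confirmed for small cases. In the sketch below the representative s is taken to be $\lambda^{p^{e-1}}$ modulo $p^e$ for a primitive root $\lambda$ modulo p; this particular choice is our reading of the appeal to Lemma A.7 (ii), stated here as an assumption, and the function names are ours.

```python
def order(x, m):
    """Multiplicative order of x modulo m; x must be prime to m."""
    k, y = 1, x % m
    while y != 1:
        y = (y * x) % m
        k += 1
    return k


def check_odd_prime_power(p, e):
    """Orders of s, 1 + p and s(1 + p) in U(Z_{p^e}) for an odd prime p."""
    m = p ** e
    # Smallest primitive root modulo p, found by brute force.
    lam = next(t for t in range(2, p) if order(t, p) == p - 1)
    # Assumed representative: s = lam^(p^(e-1)) mod p^e, congruent to lam mod p.
    s = pow(lam, p ** (e - 1), m)
    g = (s * (1 + p)) % m
    return order(s, m), order(1 + p, m), order(g, m)


for p in (3, 5, 7):
    for e in (2, 3):
        print(p, e, check_odd_prime_power(p, e), p ** (e - 1) * (p - 1))
```

For each pair (p, e) the three orders come out as p − 1, $p^{e-1}$ and $p^{e-1}(p-1)$, so the product s(1 + p) generates the whole unit group, as the proposition asserts.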


Since the proof in Proposition A.8 breaks down for p = 2, we must treat
this case separately. We find:

Proposition A.9. $U(\mathbb{Z}_4) = \{1, -1\}$ is cyclic with generator −1. For $e \ge 3$,
$U(\mathbb{Z}_{2^e})$ is not cyclic, but −1 is of order 2, 5 is of order $2^{e-2}$ and $U(\mathbb{Z}_{2^e})$
is the direct product of the cyclic groups generated by −1, 5.


Proof: The assertion concerning $U(\mathbb{Z}_4)$ is trivial. For $e \ge 3$, by Lemma
A.1, the order of $U(\mathbb{Z}_{2^e})$ is $2^{e-1}$. Clearly the order of −1 in $U(\mathbb{Z}_{2^e})$ is 2.
For the element 5 we note that by induction on $e \ge 3$ we may establish


$$5^{2^{e-3}} = (1 + 2^2)^{2^{e-3}} \equiv 1 + 2^{e-1} \pmod{2^e}.$$


Hence $5^{2^{e-3}} \ne 1$ in $\mathbb{Z}_{2^e}$, but

$$5^{2^{e-2}} \equiv (1 + 2^{e-1})^2 \equiv 1 \pmod{2^e}$$


which demonstrates that 5 is of order $2^{e-2}$ in $U(\mathbb{Z}_{2^e})$.
Now −1 is not a power of 5 in $U(\mathbb{Z}_{2^e})$ since


$$-1 \not\equiv 5^b \pmod 4,$$


so certainly

$$-1 \not\equiv 5^b \pmod{2^e}.$$


Hence if C is the cyclic subgroup generated by 5, the cosets C, −1C are
disjoint. But the index of C in $U(\mathbb{Z}_{2^e})$ is $2^{e-1}/2^{e-2} = 2$, so these two
cosets exhaust $U(\mathbb{Z}_{2^e})$. Thus every element of $U(\mathbb{Z}_{2^e})$ is uniquely of the
form $(-1)^a 5^b$ where a = 0 or 1 and $0 \le b < 2^{e-2}$. Since multiplication
is commutative, $U(\mathbb{Z}_{2^e})$ is the direct product of the cyclic subgroups
generated by −1, 5. The exponent of $U(\mathbb{Z}_{2^e})$ is $2^{e-2}$ which is less than the
order, so $U(\mathbb{Z}_{2^e})$ cannot be cyclic. □
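
Each claim in this proof can be verified exhaustively for small e. The following is an illustrative brute-force sketch; the function names `order` and `check_two_power` are ours.

```python
def order(x, m):
    """Multiplicative order of x modulo m; x must be prime to m."""
    k, y = 1, x % m
    while y != 1:
        y = (y * x) % m
        k += 1
    return k


def check_two_power(e):
    """Verify the structure of U(Z_{2^e}) claimed in Proposition A.9 (e >= 3)."""
    m = 2 ** e
    units = set(range(1, m, 2))  # the odd residues are exactly the units
    # All products (-1)^a 5^b with a in {0, 1} and 0 <= b < 2^(e-2).
    generated = {((-1) ** a * pow(5, b, m)) % m
                 for a in (0, 1) for b in range(2 ** (e - 2))}
    return (order(5, m) == 2 ** (e - 2)
            and order(m - 1, m) == 2
            # 2^(e-1) products hitting all 2^(e-1) units forces uniqueness.
            and generated == units
            # No element has order 2^(e-1), so the group is not cyclic.
            and max(order(u, m) for u in units) == 2 ** (e - 2))


print([check_two_power(e) for e in range(3, 9)])
```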


Having described the structure of $U(\mathbb{Z}_{p^e})$, we are now in a position to
investigate quadratic residues.

As in the last section we can speedily reduce the problem of finding residues
modulo m to the case of prime powers:


Proposition A.10. If $m = p_1^{e_1} \cdots p_r^{e_r}$ where $p_1, \ldots, p_r$ are distinct primes
and k is relatively prime to m, then k is a quadratic residue modulo m if
and only if it is a quadratic residue of $p_i^{e_i}$ for $1 \le i \le r$.

Proof: Using the isomorphism $\bar{\pi} : U(\mathbb{Z}_m) \to U(\mathbb{Z}_{P_1}) \times \cdots \times U(\mathbb{Z}_{P_r})$ of
Lemma A.3, $\bar{k}$ is a square if and only if each component of $\bar{\pi}(\bar{k})$ is a square,
and the ith component is the residue class of k modulo $p_i^{e_i}$.


This reduces the general problem of finding quadratic residues to the
simpler problem of finding quadratic residues modulo a prime power.
Following the last section we distinguish between the case of an odd prime
and the case p = 2. We treat p = 2 first, because it can be given an immediate answer:


Proposition A.11. The odd integer k is a quadratic residue modulo 4 if and
only if $k \equiv 1 \pmod 4$, and is a quadratic residue modulo $2^e$ for $e \ge 3$ if and
only if $k \equiv 1 \pmod 8$.


Proof: Since $U(\mathbb{Z}_4) = \{1,3\}$ the only square in $U(\mathbb{Z}_4)$ is 1. For $e \ge 3$, if
$x^2 = k$ in $U(\mathbb{Z}_{2^e})$, we use Proposition A.9 to write
\[ x = (-1)^a5^b, \quad k = (-1)^c5^d, \]
then
\[ (-1)^{2a}5^{2b} = (-1)^c5^d, \]
whence c is even and $2b \equiv d \pmod{2^{e-2}}$. Given d, the congruence can be
solved for b if and only if d is even. Thus k is a quadratic residue modulo
$2^e$ if and only if
\[ k = 5^d \]
in $\mathbb{Z}_{2^e}$ where d is even. Putting d = 2r this implies
\[ k \equiv 5^{2r} \pmod{2^e} \]
for $e \ge 3$, hence
\[ k \equiv 25^r \equiv 1 \pmod 8. \]
Conversely if $k \equiv 1 \pmod 8$ and $k \equiv (-1)^c5^d \pmod{2^e}$, then
\[ (-1)^c5^d \equiv 1 \pmod 8. \]
This can only happen when c, d are even and then $(-1)^c5^d$ is a square in
$U(\mathbb{Z}_{2^e})$.
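
Proposition A.11 is easy to test numerically. The sketch below (an illustration only; `odd_squares` and `predicted_squares` are names invented here) lists the squares of odd residues modulo $2^e$ and compares them with the residue classes the proposition predicts.

```python
def odd_squares(e):
    """Set of squares of odd residues modulo 2^e."""
    m = 2 ** e
    return {(x * x) % m for x in range(1, m, 2)}

def predicted_squares(e):
    """Odd residues that Proposition A.11 says are squares modulo 2^e (e >= 2)."""
    m = 2 ** e
    modulus = 4 if e == 2 else 8
    return {k for k in range(1, m, 2) if k % modulus == 1}
```

For example, modulo 16 both functions return the set {1, 9}.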


In the case p odd we first characterize the quadratic residues modulo p
by using a primitive root modulo p:

Lemma A.12. If s is a primitive root modulo p, then $k = s^a$ is a quadratic
residue if and only if a is even.

Proof: If a = 2b, then $k = (s^b)^2$. Since s has even order p − 1, it cannot
be a square, nor can $s^a$ for a odd. □


This characterization of quadratic residues modulo p immediately gives 
us: 


Proposition A.13. If p is an odd prime, then k is a quadratic residue
modulo $p^e$ for $e \ge 2$ if and only if k is a quadratic residue modulo p.

Proof: If $x^2 \equiv k \pmod{p^e}$, then clearly $x^2 \equiv k \pmod p$, so a quadratic
residue modulo $p^e$ is also one modulo p. Conversely, suppose k is a quadratic
residue modulo p. By Proposition A.8 we can write $k \equiv s^a(1+p)^c \pmod{p^e}$,
and reducing this modulo p gives $k \equiv s^a \pmod p$, so Lemma A.12 tells
us a = 2b for an integer b. Since 1 + p has odd order $p^{e-1}$ in $U(\mathbb{Z}_{p^e})$, we may
replace c by $c + p^{e-1}$ if necessary and so assume that c = 2d is even, whence
$k \equiv [s^b(1+p)^d]^2 \pmod{p^e}$ and k is a quadratic residue modulo $p^e$.


This leaves us with the central core of the problem: determining quadratic
residues modulo an odd prime p. Legendre, who published two volumes
on number theory in 1830, introduced a deceptively simple notation
which is ideally suited to the task. He defined the symbol (k/p) for an odd
prime p and an integer k not divisible by p as
\[ (k/p) = \begin{cases} +1 & \text{if } k \text{ is a quadratic residue modulo } p \\ -1 & \text{otherwise.} \end{cases} \]
The value of this notation can be seen by writing $k = s^a$ where s is a
primitive root modulo p. By Lemma A.12, k is a quadratic residue modulo
p if and only if a is even, hence
\[ (k/p) = (-1)^a. \]


From this it is easy to deduce the following useful properties: 


Proposition A.14.
(i) $k \equiv r \pmod p$ implies (k/p) = (r/p),
(ii) (kr/p) = (k/p)(r/p).

Proof: (i) is immediate, and (ii) follows by writing $k = s^a$, $r = s^b$, whence
$kr = s^{a+b}$ and
\[ (kr/p) = (-1)^{a+b} = (-1)^a(-1)^b = (k/p)(r/p). \] □

It is now possible to give a computational test for quadratic residues:

Proposition A.15. (Euler's Criterion.) For an odd prime p and an integer
k not divisible by p,
\[ (k/p) \equiv k^{(p-1)/2} \pmod p. \]

Proof: For a primitive root s modulo p we have $s^{p-1} \equiv 1 \pmod p$, and
since p − 1 is even,
\[ (s^{(p-1)/2} - 1)(s^{(p-1)/2} + 1) = s^{p-1} - 1 \equiv 0 \pmod p. \]
Because $s^{(p-1)/2} \not\equiv 1 \pmod p$, we deduce
\[ s^{(p-1)/2} \equiv -1 \pmod p. \]
Hence, writing $k = s^a$ as before, we have
\[ (k/p) = (-1)^a \equiv (s^{(p-1)/2})^a = (s^a)^{(p-1)/2} = k^{(p-1)/2} \pmod p. \] □
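
Euler's criterion translates directly into a short computation with modular exponentiation. The following is a minimal sketch under the hypotheses of Proposition A.15 (p an odd prime, k not divisible by p); the function name `legendre_euler` is an assumption made for this illustration.

```python
def legendre_euler(k, p):
    """Legendre symbol (k/p) via Euler's criterion.

    Assumes p is an odd prime and p does not divide k; neither is checked.
    """
    r = pow(k, (p - 1) // 2, p)   # r is 1 or p - 1 under the assumptions
    return -1 if r == p - 1 else r

def quadratic_residues(p):
    """Residues k in 1..p-1 with (k/p) = +1."""
    return [k for k in range(1, p) if legendre_euler(k, p) == 1]
```

Python's built-in three-argument `pow` uses repeated squaring, so the exponentiation is cheap even for large p.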


Example A.16. k is a quadratic residue mod 5 if $k^2 \equiv 1 \pmod 5$, giving
k = 1, 4.

We soon see the weakness in this criterion if we attempt to find the
quadratic residues modulo a larger prime, for example p = 19. In this case
k is a quadratic residue if and only if $k^9 \equiv 1 \pmod{19}$, and the calculations
concerned involve more work than just calculating all the squares of
elements in $U(\mathbb{Z}_{19})$ and solving the problem by inspection. However, Euler's
criterion can be used to deduce a much more useful test, due to Gauss.

What Gauss did was to partition the units modulo p by writing them
in the form
\[ U(\mathbb{Z}_p) = \{-(p-1)/2, \dots, -2, -1, 1, 2, \dots, (p-1)/2\} = N \cup P \]
where
\[ N = \{-(p-1)/2, \dots, -2, -1\}, \qquad P = \{1, 2, \dots, (p-1)/2\}. \]
For instance, if p = 7, then
\[ N = \{-3, -2, -1\}, \qquad P = \{1, 2, 3\}. \]


Using the usual multiplicative notation $aS = \{as \mid s \in S\}$, we can write
N = (−1)P. To find out if k is a quadratic residue, Gauss computed the
set kP and proved:

Proposition A.17. (Gauss's Criterion.) With the above notation, if $kP \cap N$
has $\nu$ elements, then $(k/p) = (-1)^\nu$.


Proof: Since k is a unit, the elements of kP are distinct and so |kP| = |P|.
Furthermore if a, b are distinct elements of P, then we may take $0 < a <
b \le (p-1)/2$. We cannot have ka = r and kb = −r, for that implies
k(a + b) is divisible by p, hence a + b is divisible by p, contradicting the
inequalities satisfied by a, b. Thus the elements $k, 2k, \dots, k(p-1)/2$ of
kP consist precisely of the elements $\pm1, \pm2, \dots, \pm(p-1)/2$, possibly in a
different order, where the number of minus signs is the number of elements
of kP in N. Hence
\[ k \cdot 2k \cdots k(p-1)/2 = (\pm1)(\pm2)\cdots(\pm(p-1)/2), \]
so, cancelling the unit $1 \cdot 2 \cdots (p-1)/2$ from both sides,
\[ k^{(p-1)/2} = (-1)^\nu \]
where $\nu$ is the number of elements in $kP \cap N$.
Thus
\[ k^{(p-1)/2} \equiv (-1)^\nu \pmod p \]
and Euler's criterion gives
\[ (k/p) = (-1)^\nu. \] □


Example A.18. Is 3 a quadratic residue modulo 19? To answer this we
calculate
\[ 3P = \{3, 6, 9, 12, 15, 18, 2, 5, 8\} = \{3, 6, 9, 2, 5, 8\} \cup \{-7, -4, -1\}, \]
so $\nu = 3$ and Gauss's criterion tells us that 3 is not a quadratic residue
modulo 19.
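
Gauss's criterion amounts to counting which multiples ka, for a in P, reduce to a residue larger than (p − 1)/2, since those are the ones represented by elements of N. A small sketch follows; the names `gauss_nu` and `legendre_gauss` are illustrative assumptions, and p is assumed to be an odd prime not dividing k.

```python
def gauss_nu(k, p):
    """Number of elements of kP lying in N, where P = {1, ..., (p-1)/2}.

    A least positive residue greater than (p-1)/2 corresponds to a
    negative representative, that is, to an element of N.
    """
    half = (p - 1) // 2
    return sum(1 for a in range(1, half + 1) if (k * a) % p > half)

def legendre_gauss(k, p):
    """Legendre symbol (k/p) computed as (-1)^nu (Gauss's criterion)."""
    return (-1) ** gauss_nu(k, p)
```

With k = 3 and p = 19 this reproduces the count $\nu = 3$ of Example A.18.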


These two criteria take us further in the search for quadratic residues
k modulo an odd prime p, for by factorizing
\[ k = (-1)^a2^bp_1^{e_1} \cdots p_r^{e_r} \]
then k is certainly a square if $a, b, e_1, \dots, e_r$ are even and will be a quadratic
residue if the factors with odd exponent are quadratic residues. Thus the
question of quadratic residues is finally reduced to the question of
determining whether −1, 2 or an odd prime q (distinct from p) are quadratic
residues modulo an odd prime p. The given criteria solve the question for
−1, 2:


Proposition A.19.

(i) $(-1/p) = (-1)^{(p-1)/2}$, so −1 is a quadratic residue modulo p if and only
if $p \equiv 1 \pmod 4$.

(ii) $(2/p) = (-1)^{(p^2-1)/8}$, so 2 is a quadratic residue modulo p if and only
if $p \equiv \pm 1 \pmod 8$.

Proof: (i) is a trivial consequence of Euler's criterion.

(ii) $2P = \{2, 4, \dots, p-1\}$, so $|2P \cap N|$ is $\nu = (p-1)/2 - r$ where r is the
largest integer such that $2r \le (p-1)/2$.

Case 1. (p − 1)/2 is even and 2r = (p − 1)/2, whence $\nu = (p-1)/2 -
(p-1)/4 = (p-1)/4$. Thus $(2/p) = (-1)^{(p-1)/4}$.

Case 2. (p − 1)/2 is odd and 2r = (p − 1)/2 − 1, so that $\nu = (p-1)/2 -
(p-1)/4 + \tfrac12 = (p+1)/4$. Thus $(2/p) = (-1)^{(p+1)/4}$. We can put these
two cases together by noting that (p − 1)/2 is even if and
only if (p + 1)/2 is odd. Raising $(-1)^\nu$ to an odd power does not change
it, so in case 1
\[ (2/p) = [(-1)^{(p-1)/4}]^{(p+1)/2} = (-1)^{(p^2-1)/8}. \]
Case 2 gives the same result by raising to the odd power (p − 1)/2. □
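
As a sanity check on part (ii), one can compare the formula with a direct search for square roots of 2. This is only an illustrative sketch; `is_prime`, `two_is_residue` and `formula_sign` are hypothetical helper names.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def two_is_residue(p):
    """True if some x satisfies x^2 = 2 (mod p), found by direct search."""
    return any((x * x) % p == 2 % p for x in range(1, p))

def formula_sign(p):
    """The value (-1)^((p^2 - 1)/8) from Proposition A.19 (ii), p odd."""
    return (-1) ** ((p * p - 1) // 8)
```

For instance 3² ≡ 2 (mod 7) and 6² ≡ 2 (mod 17), in agreement with 7 ≡ −1 and 17 ≡ 1 (mod 8).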


We can see by the work required in case (ii) of this proposition that the 
Gauss criterion is still not subtle enough to decide easily whether an odd 
prime q is a quadratic residue modulo p. To approach this final problem in 
quadratic residues the criterion is further refined to give Gauss’s ‘gem of 
higher arithmetic’: 


Theorem A.20. (Gauss's Quadratic Reciprocity Law.) If p, q are distinct
odd primes, then
\[ (p/q)(q/p) = (-1)^{(p-1)(q-1)/4}. \]

Proof: (Tough but worth it.) By the Gauss criterion,
\[ (q/p) = (-1)^\nu \]
where $\nu$ is the number of integers a in $1 \le a \le (p-1)/2$ such that there
exists an integer b satisfying
\[ aq = bp + r, \qquad -p/2 < r < 0. \]
There can be at most one such integer b for each a, so we can rephrase
the requirement: $\nu$ is the number of ordered pairs (a, b) of integers
satisfying
\[ 1 \le a \le (p-1)/2, \tag{A.9} \]
\[ -p/2 < aq - bp < 0. \tag{A.10} \]
Now from (A.9) and (A.10) we can deduce
\[ bp < aq + p/2 \le (p-1)q/2 + p/2 < pq/2 + p/2 = p(q+1)/2. \]
Hence b < (q + 1)/2, and (A.10) implies $b \ge 1$, so we have
\[ 1 \le b \le (q-1)/2. \tag{A.11} \]
Since (A.9) and (A.10) imply (A.11), it does no harm to add (A.11) to the
list of requirements to be satisfied by the ordered pair (a, b), so that $\nu$ is the
number of pairs (a, b) of integers satisfying (A.9)-(A.11). It actually does
a lot of good because of the symmetry of (A.9) and (A.11). Interchanging
p, q, and a, b, we also have
\[ (p/q) = (-1)^\mu \]
where $\mu$ is the number of ordered pairs of integers (a, b) satisfying (A.11),
(A.9) and
\[ -q/2 < bp - aq < 0 \tag{A.10$'$} \]
which can be written
\[ 0 < aq - bp < q/2. \tag{A.12} \]
Since p, q are distinct primes, (A.9) and (A.11) imply $aq - bp \neq 0$, so $\nu + \mu$
is the number of ordered pairs of integers (a, b) satisfying (A.9), (A.11)
and
\[ -p/2 < aq - bp < q/2. \tag{A.13} \]
Now
\[ (p/q)(q/p) = (-1)^{\nu+\mu}, \tag{A.14} \]


so we only really require to know $\nu + \mu$ mod 2.

Let
\[ R = \{(a,b) \in \mathbb{Z}^2 \mid 1 \le a \le (p-1)/2,\ 1 \le b \le (q-1)/2\}. \]
Then R has (p − 1)(q − 1)/4 elements. Now partition R as follows:
\[ R_1 = \{(a,b) \in R \mid aq - bp < -p/2\} \]
\[ R_2 = \{(a,b) \in R \mid -p/2 < aq - bp < q/2\} \]
\[ R_3 = \{(a,b) \in R \mid q/2 < aq - bp\}. \]
Then $R_2$ is the set of solutions of (A.9), (A.11), (A.13) as required. The map
$f : \mathbb{Z}^2 \to \mathbb{Z}^2$ given by $f(a,b) = ((p+1)/2 - a, (q+1)/2 - b)$ is easily seen
to restrict to a bijection from $R_1$ to $R_3$ (check it!), hence $|R_1| = |R_3|$. This
implies
\[ |R| = |R_1| + |R_2| + |R_3| \equiv |R_2| \pmod 2, \]
that is,
\[ (p-1)(q-1)/4 \equiv \nu + \mu \pmod 2. \]
From (A.14) we see
\[ (p/q)(q/p) = (-1)^{(p-1)(q-1)/4}. \] □
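
The counting argument in this proof can be replayed on small primes. The sketch below (illustrative only; `count_pairs` and `euler_symbol` are names introduced here) counts the pairs (a, b) in R satisfying (A.10) and (A.12), giving $\nu$ and $\mu$, and lets one confirm that $\nu + \mu$ has the same parity as (p − 1)(q − 1)/4.

```python
def count_pairs(p, q):
    """Return (nu, mu) for distinct odd primes p, q.

    nu counts pairs in R with -p/2 < aq - bp < 0   (condition A.10),
    mu counts pairs in R with  0 < aq - bp < q/2   (condition A.12).
    Inequalities are doubled to stay in integer arithmetic.
    """
    nu = mu = 0
    for a in range(1, (p - 1) // 2 + 1):
        for b in range(1, (q - 1) // 2 + 1):
            d = a * q - b * p
            if -p < 2 * d < 0:
                nu += 1
            elif 0 < 2 * d < q:
                mu += 1
    return nu, mu

def euler_symbol(k, p):
    """Legendre symbol via Euler's criterion, used only as a cross-check."""
    return 1 if pow(k, (p - 1) // 2, p) == 1 else -1
```

For p = 3, q = 5 the only pairs in R are (1, 1) and (1, 2), giving aq − bp = 2 and −1 respectively, so `count_pairs(3, 5)` is `(1, 1)`.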


As an immediate deduction we have the reciprocity law in the form
given by Gauss:

Theorem A.21. If p and q are distinct odd primes, at least one of which is
congruent to 1 modulo 4, then p is a quadratic residue of q if and only if q
is a quadratic residue of p; otherwise, if neither is congruent to 1 modulo 4,
then precisely one is a quadratic residue of the other.

Proof: If at least one of p, q is congruent to 1 modulo 4, then (p − 1)(q − 1)/4
is even, so
\[ (p/q)(q/p) = 1, \]
whence (p/q) = +1 if and only if (q/p) = +1. If neither is congruent to 1
modulo 4, then (p − 1)(q − 1)/4 is odd,
\[ (p/q)(q/p) = -1, \]
so precisely one of (p/q), (q/p) is +1 and the other is −1. □

We can imagine Gauss's intense pleasure at discovering this remarkable
result. To see its power, we only have to compare the ease with which it resolves
problems involving quadratic residues with the two criteria given
earlier or with ad hoc methods. Its use is even more clear when allied with
Legendre's clever symbol.


Example A.22. Is 1984 a quadratic residue modulo 97?
\[ (1984/97) = (44/97) = (2/97)^2(11/97) = (\pm1)^2(11/97) = (11/97) = (97/11) = (9/11) = (3/11)^2 = 1 \]
(because $97 \equiv 1 \pmod 4$). Hence 1984 is a quadratic residue modulo 97.


By putting together the appropriate results of this chapter the question 
of whether a specific number k is a quadratic residue modulo m may be 
completely solved by a succession of reductions of the problem: 

(i) By factorizing $m = p_1^{e_1} \cdots p_r^{e_r}$ and using Proposition A.10, which says k is a
quadratic residue modulo m if and only if it is a quadratic residue modulo
each prime power $p_i^{e_i}$. By Proposition A.13, for an odd prime $p_i$ it is enough
to test modulo $p_i$ itself.

(ii) If 2 is a prime factor, that part of the problem may be solved by
Proposition A.11: if $2^e$ is the largest power of 2 dividing m, for e = 1, any
odd k is a quadratic residue, for e = 2 we must check that $k \equiv 1 \pmod 4$
and for $e \ge 3$ we must check that $k \equiv 1 \pmod 8$.

(iii) For an odd prime factor p of m, we calculate (k/p). First we can
reduce k modulo p so that we may assume $1 \le k < p$. Then we factorize
$k = q_1^{f_1} \cdots q_s^{f_s}$ where the $q_i$ are primes and write
\[ (k/p) = (q_1/p)^{f_1} \cdots (q_s/p)^{f_s}. \]
We only need consider $(q_i/p)$ for $f_i$ odd. If $q_i = 2$ we use Proposition A.19 (ii);
otherwise, since $q_i < p$, we can use Gauss reciprocity to obtain
\[ (q_i/p) = (p/q_i)^2(q_i/p) = (p/q_i)(-1)^{(p-1)(q_i-1)/4}. \]
This reduces the problem to calculating $(p/q_i)$ where $p > q_i$, and reducing p
modulo $q_i$, we then have to calculate Legendre symbols for smaller primes.
Successive reductions of this nature lead to a complete solution.


Example A.23. Is 65 a quadratic residue modulo 124?

Since $124 = 2^2 \times 31$, we must check if 65 is a quadratic residue modulo
$2^2$ and 31. Modulo $2^2$ we have $65 \equiv 1 \pmod 4$, so the answer is yes, by
Proposition A.11. Modulo 31 we have
\[ (65/31) = (3/31) = (31/3)(-1)^{60/4} = -(31/3) = -(1/3), \]
and 1 is a quadratic residue modulo 3, so (65/31) = −1. Thus 65 is not a
quadratic residue modulo 124.
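
The reduction steps (i)-(iii) can be sketched in code as follows. This is an illustrative outline under stated assumptions, not a definitive implementation: the names `prime_factors`, `legendre` and `is_quadratic_residue` are invented here, factorization is done by naive trial division, and k is assumed relatively prime to m.

```python
from math import gcd

def prime_factors(n):
    """Trial-division factorization of n >= 1 as {prime: exponent}."""
    factors, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def legendre(k, p):
    """(k/p) for an odd prime p not dividing k, following step (iii)."""
    k %= p
    result = 1
    for q, f in prime_factors(k).items():
        if f % 2 == 0:
            continue                      # even powers contribute +1
        if q == 2:
            result *= (-1) ** ((p * p - 1) // 8)          # Proposition A.19 (ii)
        else:
            # Gauss reciprocity: (q/p) = (p/q) * (-1)^((p-1)(q-1)/4)
            result *= legendre(p, q) * (-1) ** ((p - 1) * (q - 1) // 4)
    return result

def is_quadratic_residue(k, m):
    """Decide whether k (prime to m) is a quadratic residue modulo m."""
    if gcd(k, m) != 1:
        raise ValueError("k must be relatively prime to m")
    for p, e in prime_factors(m).items():
        if p == 2:
            if e == 2 and k % 4 != 1:     # step (ii), e = 2
                return False
            if e >= 3 and k % 8 != 1:     # step (ii), e >= 3
                return False
        elif legendre(k, p) != 1:         # steps (i) and (iii)
            return False
    return True
```

Applied to Examples A.22 and A.23, `is_quadratic_residue(1984, 97)` is `True` and `is_quadratic_residue(65, 124)` is `False`. The recursion in `legendre` terminates because each reciprocity step replaces the modulus p by a strictly smaller prime q.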

It is the reduction of the seemingly complicated problem of quadratic 
residues to simple arithmetic such as this which highlights the brilliance of 
the jewel in Gauss’s number-theoretic crown. 


A.4 Exercises

1. Solve the following congruences (where possible):
   (i) $3x \equiv 14 \pmod{17}$,
   (ii) $6x \equiv 3 \pmod{35}$,
   (iii) $3x \equiv 13 \pmod{18}$,
   (iv) $20x \equiv 60 \pmod{80}$.

2. Solve the quadratic congruences (where possible):
   (i) $3x^2 + 6x + 5 \equiv 0 \pmod 7$,
   (ii) $x^2 + 5x + 3 \equiv 0 \pmod 4$,
   (iii) $x^2 \equiv 1 \pmod{12}$,
   (iv) $x^2 \equiv 0 \pmod{12}$,
   (v) $x^2 \equiv 2 \pmod{12}$.

3. Let $a_1, a_2, \dots, a_n$ be the complete set of residues modulo n (not in
   any specific order), and let b be an integer relatively prime to n and
   c any integer. Show that
   \[ a_1b + c, a_2b + c, \dots, a_nb + c \]
   is also a complete set of residues.

4. Calculate the Euler function $\phi(n) = |U(\mathbb{Z}_n)|$ for n = 4, 6, 12, 18.

5. Determine all the generators of $U(\mathbb{Z}_7)$, $U(\mathbb{Z}_{11})$, $U(\mathbb{Z}_{17})$.

6. Show that there exist primitive roots s modulo p such that
   \[ s^{p-1} \not\equiv 1 \pmod{p^2}. \]

7. Calculate the exponents of the following groups: $U(\mathbb{Z}_4)$, $U(\mathbb{Z}_6)$, $U(\mathbb{Z}_8)$,
   $U(\mathbb{Z}_{10})$. Which of these groups are cyclic?

8. Show that for an odd prime p there are as many square residue classes
   as non-squares in $U(\mathbb{Z}_{p^e})$ for $e \ge 1$.

9. Show that in $U(\mathbb{Z}_{2^e})$ there are exactly $2^{e-3}$ squares and $3 \cdot 2^{e-3}$
   non-squares for $e \ge 3$.

10. Determine the squares in the following groups: $U(\mathbb{Z}_7)$, $U(\mathbb{Z}_{12})$, $U(\mathbb{Z}_{49})$.

11. Use Euler's criterion to check whether 7 is a quadratic residue modulo
    23. Answer the same question using Gauss's criterion. Now calculate
    (7/23) using Gauss reciprocity.

12. Compute the following Legendre symbols: (−1/179), (6/11), (2/97).

13. Compute (97/1117), (2437/811), (23/97).

14. Is 1984 a quadratic residue modulo 365?

15. Is 2001 a quadratic residue modulo 1820?

16. Find the primes for which 11 is a quadratic residue.

17. Define the Jacobi symbol (k/m) for relatively prime integers k, m
    (m > 0) by factorizing $m = p_1^{e_1} \cdots p_r^{e_r}$ and writing
    \[ (k/m) = (k/p_1)^{e_1} \cdots (k/p_r)^{e_r}. \]
    If k, r are prime to m and $k \equiv r \pmod m$, show
    \[ (k/m) = (r/m). \]

18. If (k/m) is the Jacobi symbol of Exercise 17, for m positive and odd,
    prove
    (i) $(-1/m) = (-1)^{(m-1)/2}$,
    (ii) $(2/m) = (-1)^{(m^2-1)/8}$.

19. For k, m positive and relatively prime, prove
    \[ (k/m)(m/k) = (-1)^{(k-1)(m-1)/4}. \]


B


Dirichlet's Units Theorem


In this appendix we look a little more deeply at properties of the units in the
integers of a number field. These properties are significant for the general
theory, but are not essential to our development of Fermat's Last Theorem.
Units are important because, while ideals are best suited to technicalities,
there may come a point at which it is necessary to return to elements.
But the generator of a principal ideal is ambiguous up to multiples by a
unit. To translate results about ideals to their corresponding generators,
we therefore need to know about the units in the ring of integers. The most
fundamental and far-reaching theorem on units is that of Dirichlet, which
gives an almost complete description, in abstract terms, of the group of
units in the integers of any number field. In particular it implies that this
group is finitely generated. We shall prove Dirichlet's theorem in this
appendix. The methods are 'geometric' in that we use Minkowski's theorem,
together with a 'logarithmic' variant of the space $L^{st}$.


B.1 Introduction 


We have already described the units in the integers of $\mathbb{Q}(\sqrt d)$ for negative
squarefree d in Proposition 4.2. For d = −1 the units are $\{\pm1, \pm i\}$, for
d = −3 they are $\{\pm1, \pm\omega, \pm\omega^2\}$ where $\omega = e^{2\pi i/3}$, and for all other d < 0,
the units are just $\{\pm1\}$.

We see that in all cases U is a finite cyclic group of even order (2, 4, or
6) whose elements are roots of unity. (It is in any case obvious that every
unit of finite order is a root of unity in any number field.)

For other number fields the structure of U is more complicated. For
example in $\mathbb{Q}(\sqrt2)$ we have
\[ (1 + \sqrt2)(-1 + \sqrt2) = 1 \]
so $\varepsilon = 1 + \sqrt2$ is a unit. Now $\varepsilon$ is not a root of unity since $|\varepsilon| = 1 + \sqrt2 \neq 1$.
It follows that all the elements $\pm\varepsilon^n$ ($n \in \mathbb{Z}$) are distinct units, so U is an
infinite group. In fact, though we shall not prove it here, the $\pm\varepsilon^n$ are all
the units of $\mathbb{Q}(\sqrt2)$; hence U is isomorphic to $\mathbb{Z}_2 \times \mathbb{Z}$.

After we have proved Dirichlet's theorem it will emerge that this more
complicated structure of U is in some sense typical.


B.2 Logarithmic Space 


Let K be a number field of degree n = s + 2t, as in Chapter 8, and let $L^{st}$
be as there described. We use the notation of Chapter 8 in what follows.

Define a map
\[ l : L^{st} \to \mathbb{R}^{s+t} \]
as follows. For $x = (x_1, \dots, x_s; x_{s+1}, \dots, x_{s+t}) \in L^{st}$ put
\[ l_k(x) = \begin{cases} \log|x_k| & \text{for } k = 1, \dots, s \\ \log|x_k|^2 & \text{for } k = s+1, \dots, s+t. \end{cases} \]
Then set
\[ l(x) = (l_1(x), \dots, l_{s+t}(x)). \]
The additive property of the logarithm leads at once to the property
\[ l(xy) = l(x) + l(y) \tag{B.1} \]
for $x, y \in L^{st}$. The set of elements of $L^{st}$ with all co-ordinates non-zero
forms a group under multiplication, and l defines a homomorphism of this
group into $\mathbb{R}^{s+t}$. From Formula (8.1) of Chapter 8 it follows that
\[ \sum_{k=1}^{s+t} l_k(x) = \log|N(x)|. \tag{B.2} \]


For $\alpha \in K$ we define
\[ l(\alpha) = l(\sigma(\alpha)) \]
where $\sigma : K \to L^{st}$ is the standard map. This ambiguity in the use of l
causes no confusion, and is tantamount to an identification of $\alpha$ with $\sigma(\alpha)$.
Explicitly we find that
\[ l(\alpha) = (\log|\sigma_1(\alpha)|, \dots, \log|\sigma_s(\alpha)|, \log|\sigma_{s+1}(\alpha)|^2, \dots, \log|\sigma_{s+t}(\alpha)|^2). \]
The map $l : K \to \mathbb{R}^{s+t}$ is the logarithmic representation of K, and $\mathbb{R}^{s+t}$
the logarithmic space.
From Formula (8.2) of Chapter 8 and (B.1) above, it follows that
\[ l(\alpha\beta) = l(\alpha) + l(\beta) \qquad (\alpha, \beta \in K) \]
so that l is a homomorphism from the multiplicative group $K^* = K \setminus \{0\}$ of
K to the additive group of $\mathbb{R}^{s+t}$. Further we have, setting $l_k(\alpha) = l_k(\sigma(\alpha))$,
\[ \sum_{k=1}^{s+t} l_k(\alpha) = \log|N(\alpha)|, \]
using (8.3) of Chapter 8 and (B.2) above.
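
To make the logarithmic representation concrete, consider $K = \mathbb{Q}(\sqrt2)$, where s = 2, t = 0 and the two real embeddings send $a + b\sqrt2$ to $a \pm b\sqrt2$. The sketch below is an illustration under those assumptions; elements are stored as integer pairs (a, b), and the names `mul`, `norm` and `log_rep` are hypothetical. Floating-point logarithms are used, so identities hold only up to rounding.

```python
import math

SQRT2 = math.sqrt(2.0)

def mul(x, y):
    """Product of a + b*sqrt(2) and c + d*sqrt(2), as a pair."""
    (a, b), (c, d) = x, y
    return (a * c + 2 * b * d, a * d + b * c)

def norm(x):
    """Field norm N(a + b*sqrt(2)) = a^2 - 2*b^2."""
    a, b = x
    return a * a - 2 * b * b

def log_rep(x):
    """Logarithmic representation l(x) = (log|sigma_1(x)|, log|sigma_2(x)|)."""
    a, b = x
    return (math.log(abs(a + b * SQRT2)), math.log(abs(a - b * SQRT2)))
```

For the unit $\varepsilon = 1 + \sqrt2$, stored as `(1, 1)`, the two coordinates of `log_rep` sum to approximately zero, in line with $\log|N(\varepsilon)| = \log 1 = 0$, and `log_rep(mul(x, y))` agrees with the coordinatewise sum of `log_rep(x)` and `log_rep(y)` up to rounding.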


B.3 Embedding the Unit Group in Logarithmic Space 


Why all these logarithms? Because the group of units is multiplicative, 
whereas Minkowski’s theorem applies to lattices which are additive. We 
must pass from one milieu to the other, and it is just for this purpose that 
logarithms were created. 

Let U be the group of units of $\mathfrak{O}$, the ring of integers of K. By restriction
we obtain a homomorphism
\[ l : U \to \mathbb{R}^{s+t}. \]
It is not injective, but the kernel is easily described:

Lemma B.1. The kernel W of $l : U \to \mathbb{R}^{s+t}$ is the set of all roots of unity
belonging to U. This is a finite cyclic group of even order.

Proof: We have $l(\alpha) = 0$ if and only if $|\sigma_i(\alpha)| = 1$ for all i. The field
polynomial
\[ \prod_i (t - \sigma_i(\alpha)) \]
lies in $\mathbb{Z}[t]$ by Theorem 2.6 (a) combined with Lemma 2.13. We can therefore
appeal to Lemma 11.6 to conclude that all the $\sigma_i(\alpha)$ are roots of unity,
in particular $\alpha$ itself.

The image $\sigma(\mathfrak{O})$ in $L^{st}$ is a lattice by Corollary 8.3, so it is discrete by
Theorem 6.1. Since the unit circle in $\mathbb{C}$ maps to a bounded subset in $L^{st}$
it follows that $\mathfrak{O}$ contains only finitely many roots of unity, so W is finite.
But any finite subgroup of $K^*$ is cyclic (see Stewart [71], Theorem 16.7, p.
171). Finally, W contains −1 which has order 2, so W has even order. □

Obviously the next thing to find out is the image of U in $\mathbb{R}^{s+t}$. Let us
call it E. We have:

Lemma B.2. The image E of U in $\mathbb{R}^{s+t}$ is a lattice of dimension $\le s+t-1$.

Proof: The norm of every unit is $\pm1$, so for a unit $\varepsilon$ we have
\[ \sum_{k=1}^{s+t} l_k(\varepsilon) = \log|N(\varepsilon)| = \log 1 = 0. \]
Hence all points of E lie in the subspace V of $\mathbb{R}^{s+t}$ whose elements $(x_1, \dots, x_{s+t})$
satisfy
\[ x_1 + \dots + x_{s+t} = 0. \]
This has dimension s + t − 1.

To prove E is a lattice it is sufficient to prove it discrete by Theorem
6.1. Let $\|\ \|$ be the usual length function on $\mathbb{R}^{s+t}$. Suppose $0 < r \in \mathbb{R}$,
and
\[ \|l(\varepsilon)\| < r. \]
Now $|l_k(\varepsilon)| \le \|l(\varepsilon)\| < r$, from which we get
\[ |\sigma_k(\varepsilon)| < e^r \quad (k = 1, \dots, s) \]
\[ |\sigma_{s+j}(\varepsilon)|^2 < e^r \quad (j = 1, \dots, t). \]
Hence the set of points $\sigma(\varepsilon)$ in $L^{st}$ corresponding to units with $\|l(\varepsilon)\| < r$
is bounded, so finite by Corollary 8.3. Hence E intersects each closed ball
in $\mathbb{R}^{s+t}$ in a finite set, so is discrete.

Therefore E is a lattice. Since $E \subseteq V$ it has dimension $\le s+t-1$. □

Already we know quite a lot about U. In particular, U is finitely
generated; for W is finite and $U/W \cong E$ is a lattice, so free abelian, with rank
$\le s+t-1$. All that remains is to find the exact dimension of the lattice
E. In fact it is s + t − 1, as we prove in the next section.


B.4 Dirichlet’s Theorem 


The main thing we lack is a topological criterion for deciding whether a
lattice L in a vector space V has the same dimension as V. We remedy
this lack with:

Lemma B.3. Let L be a lattice in $\mathbb{R}^m$. Then L has dimension m if and
only if there exists a bounded subset B of $\mathbb{R}^m$ such that
\[ \mathbb{R}^m = \bigcup_{x \in L} (x + B). \]

Proof: If L has dimension m then we may take for B a fundamental
domain of L, and appeal to Lemma 6.2.

Suppose conversely that B exists but, for a contradiction, L has dimension
d < m. An intuitive argument goes thus: the quotient $\mathbb{R}^m/L$ is, by
Theorem 6.6, the direct product of a torus and $\mathbb{R}^{m-d}$. The condition on
B says that the image of B under the natural map $\nu : \mathbb{R}^m \to \mathbb{R}^m/L$ is the
whole of $\mathbb{R}^m/L$. But because B is bounded this contradicts the presence
of a direct factor $\mathbb{R}^{m-d}$ which is unbounded. By taking more account of
the topology than we have done hitherto, this argument can easily be made
rigorous. Alternatively, we operate in $\mathbb{R}^m$ instead of $\mathbb{R}^m/L$ as follows.

Let V be the subspace of $\mathbb{R}^m$ spanned by L. If L has dimension less than
m, then $\dim V < \dim \mathbb{R}^m$. Hence we can find an orthogonal complement
$V'$ to V in $\mathbb{R}^m$. The condition on B implies that $\mathbb{R}^m = \bigcup_{v \in V} (v + B)$, which
means that $V'$ is the image of B under the projection $\pi : \mathbb{R}^m \to V'$. But
$\pi$ does not increase distances, so $V'$ is bounded, an obvious contradiction. □


In fact, what we are saying topologically is that L has dimension m if 
and only if the quotient topological group R™/L is compact. (This can 
profitably be compared with Theorem 1.17. In fact there is some kind 
of analogy between free abelian groups and sublattices of vector spaces; as 
witness to which the reader should compare Lemma 9.3 and Theorem 1.17.) 


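Before proving that E has dimension s + t − 1 it is convenient to extract
one computation from the proof:

Lemma B.4. Let $y \in L^{st}$ and let $\lambda_y : L^{st} \to L^{st}$ be defined by $\lambda_y(x) = yx$.
Then $\lambda_y$ is a linear map and
\[ \det \lambda_y = N(y). \]

Proof: It is obvious that $\lambda_y$ is linear. To compute $\det \lambda_y$ we use the basis
(8.4) of Chapter 8. If
\[ y = (x_1, \dots, x_s; y_1 + iz_1, \dots, y_t + iz_t) \]
then we obtain for $\det \lambda_y$ the expression
\[ \begin{vmatrix} x_1 & & & & & & \\ & \ddots & & & & & \\ & & x_s & & & & \\ & & & y_1 & -z_1 & & \\ & & & z_1 & y_1 & & \\ & & & & & \ddots & \\ & & & & & y_t & -z_t \\ & & & & & z_t & y_t \end{vmatrix} \]
(all unmarked entries being 0), which is
\[ x_1 \cdots x_s(y_1^2 + z_1^2) \cdots (y_t^2 + z_t^2) = N(y). \] □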


The way is now clear for the proof of:

Theorem B.5. The image E of U in $\mathbb{R}^{s+t}$ is a lattice of dimension s + t − 1.

Proof: As before let V be the subspace of $\mathbb{R}^{s+t}$ whose elements satisfy
\[ x_1 + \dots + x_{s+t} = 0. \]
Then $E \subseteq V$, and dim V = s + t − 1. To prove the theorem we appeal to
Lemma B.3: it is sufficient to find in V some bounded subset B such that
\[ V = \bigcup_{e \in E} (e + B). \]
This 'additive' property translates into a 'multiplicative' property in $L^{st}$.
Every point in $\mathbb{R}^{s+t}$ is the image under l of some point in $L^{st}$, so every
point in V is the image of some point in $L^{st}$. In fact, for $x \in L^{st}$, we have
$l(x) \in V$ if and only if $|N(x)| = 1$. So if we let
\[ S = \{x \in L^{st} : |N(x)| = 1\} \]
then $l(S) = V$. If $X_0 \subseteq S$ is bounded, then so is $l(X_0)$, as may be verified
easily. If $x \in S$ then the multiplicativity of the norm implies that $xX_0 \subseteq S$
if $X_0 \subseteq S$. In particular if $\varepsilon$ is a unit then $\sigma(\varepsilon)X_0 \subseteq S$. So if we can find a
bounded subset $X_0$ of S such that
\[ S = \bigcup_{\varepsilon \in U} \sigma(\varepsilon)X_0 \tag{B.3} \]


then $B = l(X_0)$ will do what is required in V.

Now we find a suitable $X_0$. Let M be the lattice in $L^{st}$ corresponding
to $\mathfrak{O}$ under $\sigma$. Consider the linear transformation $\lambda_y : L^{st} \to L^{st}$ ($y \in L^{st}$)
of Lemma B.4. If $y \in S$ then the determinant of $\lambda_y$ is N(y), which is $\pm1$.
Therefore $\lambda_y$ is unimodular. This, by the remark after Lemma 9.3, implies
that any fundamental domain for the lattice $yM (= \lambda_y(M))$ has the same
volume as a fundamental domain for M. Call this volume v.

Choose real numbers $c_i > 0$ with
\[ Q = c_1 \cdots c_{s+t} > \left(\frac{4}{\pi}\right)^t v. \]
Let X be the set of $x \in L^{st}$ for which
\[ |x_k| < c_k \quad (k = 1, \dots, s) \]
\[ |x_{s+j}|^2 < c_{s+j} \quad (j = 1, \dots, t). \]
Then by Lemma 9.2 there exists in yM a non-zero point $x \in X$. We have
\[ x = y\sigma(\alpha) \quad (0 \neq \alpha \in \mathfrak{O}). \]
Since
\[ N(x) = N(y)N(\alpha) = \pm N(\alpha) \]
it follows that
\[ |N(\alpha)| < Q. \]
By Theorem 5.17 (c) only finitely many ideals of $\mathfrak{O}$ have norm < Q.
Considering principal ideals, and recalling that the generators of these are
ambiguous up to unit multiples, it follows that there exist in $\mathfrak{O}$ only finitely
many pairwise non-associated numbers
\[ \alpha_1, \dots, \alpha_N \]
whose norms are < Q in absolute value. Thus for some i = 1, ..., N we
have $\alpha\varepsilon = \alpha_i$, with $\varepsilon$ a unit. It follows that
\[ y = x\sigma(\alpha_i^{-1})\sigma(\varepsilon). \tag{B.4} \]
Now define
\[ X_0 = S \cap \left(\bigcup_{i=1}^N \sigma(\alpha_i^{-1})X\right). \tag{B.5} \]
Since X is bounded so are the sets $\sigma(\alpha_i^{-1})X$, and since N is finite $X_0$ is
bounded. Obviously $X_0$ does not depend on the choice of $y \in S$.

But now, since y and $\sigma(\varepsilon) \in S$, we have $x\sigma(\alpha_i^{-1}) \in S$, hence $x\sigma(\alpha_i^{-1}) \in
X_0$. Then (B.4) shows that
\[ y \in \sigma(\varepsilon)X_0. \]
Hence (B.3) holds, y being an arbitrary element of S, and the theorem is
proved. □


We can easily put this result into a more explicit form, obtaining the
Dirichlet Units Theorem:

Theorem B.6. The group of units of $\mathfrak{O}$ is isomorphic to
\[ W \times \mathbb{Z} \times \dots \times \mathbb{Z} \]
where W is as described in Lemma B.1 and there are s + t − 1 direct
factors $\mathbb{Z}$.

Proof: By Theorem B.5, $U/W \cong \mathbb{Z} \times \dots \times \mathbb{Z} = \mathbb{Z}^{s+t-1}$. Since W is
finite it follows that U is a finitely generated abelian group, hence a direct
product of cyclic groups (see Fraleigh [25], Theorem 9.3, p. 90). Since W
is finite and U/W torsion-free, it follows that W is the set of elements of
U of finite order, which is the product of all the finite cyclic factors in the
direct decomposition. The other factors are all infinite cyclic; looking at
U/W tells us there are exactly s + t − 1 of them. □

In more classical terms, Dirichlet's theorem asserts the existence of a
system of s + t − 1 fundamental units
\[ \eta_1, \dots, \eta_{s+t-1} \]
such that every unit of $\mathfrak{O}$ is representable uniquely in the form
\[ \zeta\eta_1^{r_1} \cdots \eta_{s+t-1}^{r_{s+t-1}} \]
for a root of unity $\zeta$ and rational integers $r_i$.

Let us return briefly to $\mathbb{Q}(\sqrt2)$, which we looked at in Section 1 of this
chapter. For this field, s = 2, t = 0, so s + t − 1 = 1. Hence U is of the
form $W \times \mathbb{Z}$ where W consists of the roots of unity in $\mathbb{Q}(\sqrt2)$. These are
just $\pm1$, so we get $U \cong \mathbb{Z}_2 \times \mathbb{Z}$ as asserted in Section 1. Note, however,
that we have still not proved that $1 + \sqrt2$ is a fundamental unit. In fact,
this is true in general of Dirichlet's theorem: it does not allow us to find
any specific system of fundamental units. Other methods can be developed
to solve this problem, and the Dirichlet theorem is still needed to tell us
when we have found sufficiently many units.


B.5 Exercises

1. Find units, not equal to 1, in the rings of integers of the fields $\mathbb{Q}(\sqrt d)$
   for d = 2, 3, 5, 6, 7, 10.

2. Use Dirichlet's theorem to prove that for any square-free positive
   integer d there exist infinitely many integer solutions x, y to the Pell
   equation
   \[ x^2 - dy^2 = 1. \]
   (Really this should not be called the Pell equation, since Pell did not
   solve it. It was mistakenly attributed to him by Euler, and the name
   stuck.)

3. Prove that $1 + \sqrt2$ is a fundamental unit for $\mathbb{Q}(\sqrt2)$.

4. Let $\eta_1, \dots, \eta_{s+t-1}$ be a system of fundamental units for a number
   field K. Show that the regulator
   \[ R = |\det(\log|\sigma_i(\eta_j)|)| \]
   is independent of the choice of $\eta_1, \dots, \eta_{s+t-1}$.

5. Show that the group of units of a number field K is finite if and only
   if $K = \mathbb{Q}$, or K is an imaginary quadratic field.

6. Show that a number field of odd degree contains only two roots of
   unity.